You are here: Home » Content » Relational Algebra
Quality
Affiliated with  (?)
This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
Lenses
Tags  (?)
These tags come from the endorsement, affiliation, and other lenses that include this content.

Relational Algebra

Module by: Nguyen Kim Anh

Summary: In this lecture, we will discuss the first formal languages for the relational models: Relational Algebra

In this lecture, we will discuss the first formal languages for the relational models: Relational Algebra

Relational Algebra

Relational Algebra (RA) can be viewed as a data manipulation language for relational model. It consists of several basic operations which is enable user to specify retrieval requests. RA is called a procedural language in which user need to specify how to retrieve the expected data.
Relational Algebra has the following components:
  • Operands: Relations or Variables that represent relations
  • Operators that map relations to relations
  • Rules for combining operands and operators to relational algebra expression
  • Rules for evaluating those expressions
Operations of relational algebra include the followings:
  • Union, Intersect, Set Difference, Cartesian Product are operations based on set theory
  • Select, Project, Join, Division are operations developed especially for relational databases.
Figure 1: Example Database

Relational Algebra Operations from Set Theory

Definition: Two relations r(A1, A2, …, An) and s(B1, B2, …, Bn) are union compatible if they have the same degree n and dom(Ai) = dom(Bi) for 1 ≤ i ≤ n.
This mean two union compatible relations have the same number of attributes and each corresponding pair of attributes have the same domain
  1. UNION Operation
The UNION operation combines two union compatible relations into a single relation via set union of sets of tuples.
  • Notation: r1r2r1r2 size 12{r1` union `r2`} {}
Figure 2: Union Operation Notation
  • r1r2={ttr1tr2}r1r2={ttr1tr2} size 12{r1` union `r2= lbrace t \lline t in r1 or t in r2 rbrace } {} where r1(R) and r2(R)
  • Result size: {}r1r2r1+r2r1r2r1+r2 size 12{ \lline r1` union `r2 \lline ` <= ` \lline r1 \lline + \lline r2 \lline } {}
  • Result schema: R
  • Producing the result of UNION
    • Make a copy of relation r1
    • For each tuple t in relation r2, check whether it is in the result or not. If it is not already in the result then place it there.
  • Example:
Figure 3: Union Operation Example
  1. INTERSECTION Operation
The INTERSECTION operation combines two union compatible relations into a single relation via set intersection of sets of tuples.
  • Notation: r1r2r1r2 size 12{r1` intersection `r2} {}
Figure 4: Intersection Operation Notation
  • r1r2={ttr1tr2}r1r2={ttr1tr2} size 12{r1` intersection `r2= lbrace t \lline t in r1 and t in r2 rbrace } {} where r1(R) and r2(R)
  • Result size: r1r2min(r1,r2)r1r2min(r1,r2) size 12{ \lline r1` intersection `r2 \lline ` <= `"min" \( \lline r1 \lline , \lline r2 \lline \) rbrace } {}
  • Result schema: R
  • Producing the result of INTERSECTION
    • Initially, result set is empty
    • For each tuple t in relation r1, if t is in the relation r2 then place t in the result set.
  • Example
Figure 5: Intersection Operation Example
  1. SET DIFFERENCE Operation
The DIFFERENCE operation finds the set of tuples that exist in one relation but do not occur in the other union compatible relation
  • Notation: r1 \ r2
Figure 6: Difference Operation Notation
  • r1={ttr1tr2}r1={ttr1tr2} size 12{r1`\`r2= lbrace t \lline t in r1` and `t notin r2 rbrace } {} where r1(R) and r2(R)
  • Result schema: R
  • Producing the result of the DIFFERENCE operation
    • Initially, result set is empty
    • For each tuple in r1, check whether it appear in r2 or not. If it does not then place it in the result set. Otherwise, ignores it
  • Example
Figure 7: Difference Operation Example
  1. CARTESIAN PRODUCT Operation
The PRODUCT operation combines information from two relations pairwise on tuples.
  • Notation: r x s
  • r×s={(t1,t2)t1rt2s}r×s={(t1,t2)t1rt2s} size 12{r` times `s`=` lbrace \( t1`,`t2 \) ` \lline `t1 in r and `t2 in `s rbrace } {} where r(R) and s(S)
  • Each tuple in the result contains all attributes in r and s, possibly with some fields renamed to avoid ambiguity. The result set contains all possible tuple that can be construct from one tuple in r and one tuple in s.
  • Result schema: If we have R(A1, A2, …, An) and S(B1, B2, …, Bm) then the list of attributes in Result is (A1, A2, …, An, B1, B2, …, Bm)
  • Result size: r×s=rsr×s=rs size 12{ \lline r` times s \lline `=` \lline r \lline `*` \lline s \lline } {}
  • Producing the result of PRODUCT operation:
    • For each tuple in r, form new tuples by pair it with each tuple in s
    • Place all of these new tuples in the result set
  • Example
Figure 8: Catesian Product Operation Notation
Figure 9: Catesian Product Operation Example
  • Note: As we can notice, the CARTESIAN PRODUCT operation by itself is not a useful querying mechanism since the result size is large. However, it is an extremely important operation of relational algebra since it is the basic mechanism for combining information across relations. We will discuss about this topic in more detail in Query Processing lecture.

Other Relational Algebra Operations

  1. SELECT Operation
The SELECT operation is an unary operation. It means the input of this operation is only one relation and it output is also a relation.
The SELECT operation returns a subset of the tuples from a relation that satisfies a selection condition. The SELECT operation can be viewed as a horizontal filter of the relation. It partitions the input relation into two sets of tuples: those tuples that satisfies the condition are select, those do not satisfy the condition are discarded.
  • Notation: σselectioncondition>(r)σselectioncondition>(r) size 12{σ rSub { size 8{ ital "selection"` - ` ital "condition"}} \( r \) } {}
Figure 10: Select Operation Notation
  • σ F ( r ) = { t t r F ( t ) } σ F ( r ) = { t t r F ( t ) } size 12{σ rSub { size 8{F} } \( r \) `=` lbrace t \lline `t in `r and `F \( t \) rbrace `} {} where r(R) and F is a boolean expression on attributes in R
The selection condition is made up of a number of clauses of the form
  • <attribute name> <comparison op> <constant value> OR
  • <attribute name 1> <comparison op> <attribute name 2>
In the clause, the comparison operations could be one of the following: ≤, ≥, ≠, =, >, < . Clauses are connected by Boolean operators : and, or , not
  • Result size σF(r)rσF(r)r size 12{ \lline σ rSub { size 8{F} } \( r \) \lline <= ` \lline r \lline `} {}
  • Result schema: R
  • Producing the result of the SELECT operation
    • Selection condition F is evaluate for each tuple in r, with the attribute variables in F set to their values in the tuples
    • Any tuple t that F(t) = true is placed in the result set
    • Other tuples are not include in the result.
  • Example: Retrieve the Id, Name, Suburb of students who live in Bundoora
Figure 11: Select Operation Example - 1
  • Example: Retrieve the Id, Name, Suburd of student who’s name is Mary or students who live in Bundoora
Figure 12: Select Operation Example - 2
  1. PROJECT Operation
The PROJECT operation is another unary operation. This operation returns a set of tuples containing a subset of the attributes in the original relation. Thus, as we state that the SELECT operation selects some rows and discards the others. The PROJECT operation, on the other hand, selects some columns of the relation and discards the other column. The PROJECT operation can be viewed as the vertical filter of the relation.
  • Notation: πattributelist(r)πattributelist(r) size 12{π rSub { size 8{ ital "attribute"` - ital "list"} } \( r \) } {}
Figure 13: Project Operation Notation
  • π X ( r ) = { t [ X ] t r } π X ( r ) = { t [ X ] t r } size 12{π rSub { size 8{X} } \( r \) `=` lbrace t` \[ X \] ` \lline t in `r rbrace ` } {} where r(R)
  • Result size: πX(r)rπX(r)r size 12{ \lline π rSub { size 8{X} } \( r \) \lline <= ` \lline r \lline `} {}
  • Result schema: R’(X)
  • Producing the result of PROJECT operation
    • Take each tuple in the original relation, extract the values of the specified attributes
    • Form new tuple from these values and place the new tuple in the result if it is not already there. This steps includes the duplicate removal phase, this makes the result of PROJECT operation a relation
  • Example: Retrieve the suburbs that are stored in database
Figure 14: Project Operation Example - 1
  • Retrieve the name of the subjects and department which is responsible for the subject
Figure 15: Project Operation Example - 2
  1. JOIN Operation
The JOIN operation is used to combine related tuples from two relations into a single tuple. The Theta-JOIN is a specialized product containing only pairs that match on a supplied condition called join-condition.
  • Notation: rjoinconditionsrjoinconditions size 12{r` ⊳ ⊲ rSub { size 8{ ital "join"` - ` ital "condition"} } s} {}
  • rCs={(t1,t2)t1rt2sC(t1,t2)}rCs={(t1,t2)t1rt2sC(t1,t2)} size 12{r` ⊳ ⊲ rSub { size 8{C} } s`=` lbrace \( t1,t2 \) ` \lline t1 in r and t2 in s and C \( t1,t2 \) rbrace `} {}where r(R) , s(S)
  • Similar to PRODUCT, each tuple in the result of JOIN operation contains all attributes from two original relations. However, in this operation one tuple in R and one tuple in S can be combined together to form a tuple in the result if the combination satisfies the join condition.
  • Join condition is of the form:
<condition> AND <condition> AND …AND <condition>
where <condition> is a comparision between one attribute in R and one attribute in S, provided that these two attributes have the same domain.
  • Result size: rCsrsrCsrs size 12{ \lline r` ⊳ ⊲ rSub { size 8{C} } `s \lline ` <= ` \lline r \lline `*` \lline s \lline } {}
  • Result schema: If we have R(A1, A2, …, An) and S(B1, B2, …, Bm) then the list of attributes in Result is (A1, A2, …, An, B1, B2, …, Bm)
  • Producing the result of JOIN operation:
    • For each tuple in r, form new tuples by pair it with each tuple in s
    • If the new tuple satisfies the specified condition, then place it in the result set.
  • Example:
Figure 16: Join Operation Example - 1
Figure 17: Join Operation Example - 2
Figure 18: Join Operation Example - 3
Variations of JOIN
EQUI-JOIN: A JOIN where the only comparision operator used in the join condition is “=” is called EQUI-JOIN. The result of Equi Join always has one or more pairs of attributes that have identical values in every tuple.
Example:
Figure 19: Natural Join Operation Example - 1
NATURAL JOIN: The Natural Join operation is a specialised product where the result tuple contains only pairs of tuples that match on their common attributes with one of each pair of common attributes is eliminated. The standard definition of Natural Join requires that the two join attributes have the same name. Therefore, we can see that Natural Join is created to get rid of the duplicate columns in an Equi Join.
  • Notation: r * s
  • Natural Join can be defined using other operation rs=πRS(σcondition(r×s))rs=πRS(σcondition(r×s)) size 12{r*s=π rSub { size 8{r union s} } \( σ rSub { size 8{ ital "condition"} } \( r times s \) \) } {}
where r(R) and s(S) and condition is boolean expression (A1 = B1) AND (A2 = B2 ) AND … AND (Ak = Bk) with Ai is the attribute in r , Bi is the attribute in s and (Ai, Bi) is a pair of common attributes.
Producing the result of Natural Join
  • For each tuple in relation r, compare common attributes with those in each tuple of s
  • If two tuples match in their common attributes then combine tuples, remove duplicate attributes and add to the result.
  • Example: From the example of Equi Join, assume that the attribute list in s now is ( E, F,B) instead of (E, F, G) then we can have the expression r * s .
Figure 20
  • Retrieve the information of student who enrols in at least one course.
Figure 21: Natural Join Operation Example - 2
  1. DIVISION Operation
The Divition Operation is defined on two relation r(U1) and s(U2) where U2 is the subset of U1 and s is not an empty relation: r÷s={ttr(U1U2)satisfy}r÷s={ttr(U1U2)satisfy} size 12{r div s= lbrace t \lline t in r \( U rSub { size 8{1} } - U rSub { size 8{2} } \) and ital "satisfy" rbrace } {} where satisfy=tss(trr(tr[U2]=tstr[U1U2]=t))satisfy=tss(trr(tr[U2]=tstr[U1U2]=t)) size 12{ ital "satisfy"= forall t rSub { size 8{s} } in s \( ` exists t rSub { size 8{r} } in r \( `t rSub { size 8{r} } \[ U rSub { size 8{2} } \] =t rSub { size 8{s} } ` and t rSub { size 8{r} } \[ U rSub { size 8{1} } - U rSub { size 8{2} } \] =t \) \) } {}
This means that for a tuple t to appear in the result of Division, the values in t must appear in r in combination with every tuple in s.
The Division is very useful for a special kind of query such as “ Retrieve the name of the student who enrolls in all course teach by Professor Ba”
  • Producing the result of the Division operation
    • Consider each subset of tuple in r that match on t[U1 – U2]
    • For this subset of tuples, take the values t[U2] from each. If this covers all tuples in s then add t[U1 – U2] in the result.
  • Example: Retrieve the name of subject that is taught in all courses
Figure 22: Division Operation Example

Data Manipulation with Relational Algebra Expression

Sample Database
In this session, we use the COMPANY database in the examples for illustrating the use of Relational Algebra for answering several queries.
The relational database schema for the COMPANY database is specified as below
EMPLOYEE (EID, Name, Bdate, Address, Salary, DeptId)
DEPARTMENT(DeptId, Dname, Office, Mng-EID)
PROJECT(Code, Name, Budget, DeptId)
JOIN(EID, PCode, StartDate)
EMP-DEPENDENT(EID, Dependent-Name, Bdate, Relationship)
Sample Queries
Query 1: Find all employees whose salary is greater than 30.000
σ salary > 30000 ( EMPLOYEE ) σ salary > 30000 ( EMPLOYEE ) size 12{σ rSub { size 8{ ital "salary"`>`"30000"} } \( ital "EMPLOYEE" \) } {}
Query 2: Find the name, address of employees who works for department number 1
π Name , Address ( σ DeptId = 1 ( EMPLOYEE ) ) π Name , Address ( σ DeptId = 1 ( EMPLOYEE ) ) size 12{π rSub { size 8{ ital "Name", ital "Address"} } \( σ rSub { size 8{ ital "DeptId"`=`1} } \( ital "EMPLOYEE"` \) \) } {}
Query 3: Find the name of the department that employee John Smith works for.
π Dname ( σ Name = ' John Smith ' ( EMPLOYEE DEPARTMENT ) ) π Dname ( σ Name = ' John Smith ' ( EMPLOYEE DEPARTMENT ) ) size 12{π rSub { size 8{ ital "Dname"} } \( σ rSub { size 8{ ital "Name"`=`' ital "John"` ital "Smith"'} } \( ital "EMPLOYEE"* ital "DEPARTMENT" \) \) } {}
Query 4: Find the EID, start date of the employees who works for project number P1 or project number P2
π EID , StartDate ( σ PCode = 1 PCode = 2 ( JOIN ) ) π EID , StartDate ( σ PCode = 1 PCode = 2 ( JOIN ) ) size 12{π rSub { size 8{ ital "EID", ital "StartDate"} } \( σ rSub { size 8{ ital "PCode"`=`1` or ` ital "PCode"`=`2} } \( ital "JOIN" \) \) } {}
Query 5: Find the name, relationship of all the dependents of employees who works for Department Human Resource
π Dependent Name ,Re lationship ( σ DName = ' Human Re source ' ( EMP DEPENDENT EMPLOYEE DEPARTMENT ) ) π Dependent Name ,Re lationship ( σ DName = ' Human Re source ' ( EMP DEPENDENT EMPLOYEE DEPARTMENT ) ) alignl { stack { size 12{π rSub { size 8{ ital "Dependent" - ital "Name"",Re" ital "lationship"} } \( σ rSub { size 8{ ital "DName"`=`' ital "Human"`"Re" ital "source"'} } } {} # \( ital "EMP" - ital "DEPENDENT"* ital "EMPLOYEE"* ital "DEPARTMENT" \) \) {} } } {}
Query 6: Finds the name of the employees who join in every project.
π Name ( EMPLOYEE ( π EID , PCode ( JOIN ) ÷ π Code ( PROJECT ) ) ) π Name ( EMPLOYEE ( π EID , PCode ( JOIN ) ÷ π Code ( PROJECT ) ) ) size 12{π rSub { size 8{ ital "Name"} } \( ital "EMPLOYEE"* \( π rSub { size 8{ ital "EID", ital "PCode"} } \( ital "JOIN" \) div π rSub { size 8{ ital "Code"} } \( ital "PROJECT" \) \) \) } {}
Query 7: Finds the names of the employees who have no dependent
π Name ( EMPLOYEE ( π EID ( EMPLOYEE ) EID ( EMP DEPENDENT ) ) ) π Name ( EMPLOYEE ( π EID ( EMPLOYEE ) EID ( EMP DEPENDENT ) ) ) size 12{π rSub { size 8{ ital "Name"} } \( ital "EMPLOYEE"* \( π rSub { size 8{ ital "EID"} } \( ital "EMPLOYEE" \) `\`π rSub { size 8{ ital "EID"} } \( ital "EMP" - ital "DEPENDENT" \) \) \) } {}
Query 8: Find the name of employees who works for both project number 1 and project number 2
π Name ( EMPLOYEE ( π EID ( σ PCode = 1 JOIN ) π EID ( σ PCode = 2 JOIN ) ) ) π Name ( EMPLOYEE ( π EID ( σ PCode = 1 JOIN ) π EID ( σ PCode = 2 JOIN ) ) ) size 12{π rSub { size 8{ ital "Name"} } ` \( ital "EMPLOYEE"`* \( π rSub { size 8{ ital "EID"} } \( σ rSub { size 8{ ital "PCode"`=`1`} } ital "JOIN" \) ` intersection `π rSub { size 8{ ital "EID"} } \( σ rSub { size 8{ ital "PCode"`=`2`} } ital "JOIN" \) \) ` \) } {}
Query 9: Find the name of the manager who has at least one dependent
π Name ( EMPLOYEE ( π EID ( EMP DEPENDENT EID = Mng EID DEPARTMENT ) ) ) π Name ( EMPLOYEE ( π EID ( EMP DEPENDENT EID = Mng EID DEPARTMENT ) ) ) size 12{π rSub { size 8{ ital "Name"} } ` \( ital "EMPLOYEE"`* \( π rSub { size 8{ ital "EID"} } \( ` ital "EMP" - ital "DEPENDENT" ⊳ ⊲ rSub { size 8{ ital "EID"`= ital "Mng" - ital "EID"} } ` ital "DEPARTMENT" \) \) \) } {}

Comments, questions, feedback, criticisms?

Send feedback