17  Inner Joins.

# load the packages that we'll need
if (!require(sqldf)) { install.packages("sqldf"); require(sqldf) }
if (!require(readr)) { install.packages("readr"); require(readr) }

17.1 The “books database”

In this and other sections we will be referring to the “books database”. Please see the section above that describes the “books database” for information about the database and how to download and install the files.

17.2 Intro to “JOIN”ing two tables

# read the data for the tables we'll be using
customers = read_csv("data/customers.csv", show_col_types=FALSE)
orders = read_csv("data/orders.csv", show_col_types=FALSE)

17.2.1 Overview of the customers and orders tables

# The examples in this page use the customers and orders tables.
# The data for those tables can be imported into R using the
# customers.csv and the orders.csv files.
# This code to read in the data from the csv files appears 
# further up in this file. 
#
# see this page for more info: https://www.w3schools.com/sql/sql_join.asp


#........................................................................
# Lets see a few rows from the customers table and from the orders table
#........................................................................

sqldf("select * from customers order by CustomerId limit 10")
   CustomerID                       CustomerName        ContactName                       Address        City PostalCode Country
1           1                Alfreds Futterkiste       Maria Anders                 Obere Str. 57      Berlin      12209 Germany
2           2 Ana Trujillo Emparedados y helados       Ana Trujillo Avda. de la Constitución 2222 México D.F.      05021  Mexico
3           3            Antonio Moreno Taquería     Antonio Moreno                Mataderos 2312 México D.F.      05023  Mexico
4           4                    Around the Horn       Thomas Hardy               120 Hanover Sq.      London    WA1 1DP      UK
5           5                 Berglunds snabbköp Christina Berglund                Berguvsvägen 8       Luleå   S-958 22  Sweden
6           6            Blauer See Delikatessen         Hanna Moos                Forsterstr. 57    Mannheim      68306 Germany
7           7               Blondel père et fils Frédérique Citeaux              24, place Kléber  Strasbourg      67000  France
8           8          Bólido Comidas preparadas      Martín Sommer                C/ Araquil, 67      Madrid      28023   Spain
9           9                           Bon app'  Laurence Lebihans          12, rue des Bouchers   Marseille      13008  France
10         10             Bottom-Dollar Marketse  Elizabeth Lincoln            23 Tsawassen Blvd.   Tsawassen    T2F 8M4  Canada
sqldf("select * from orders order by CustomerId limit 10")
   OrderID CustomerID EmployeeID  OrderDate ShipperID
1    10308          2          7  9/18/1996         3
2    10365          3          3 11/27/1996         2
3    10355          4          6 11/15/1996         1
4    10383          4          8 12/16/1996         3
5    10278          5          8  8/12/1996         2
6    10280          5          2  8/14/1996         1
7    10384          5          3 12/16/1996         3
8    10265          7          2  7/25/1996         1
9    10297          7          5   9/4/1996         2
10   10360          7          4 11/22/1996         3

17.3 “Primary Keys” and “Foreign Keys”

#------------------------------------------------
# Primary keys and Foreign keys
#------------------------------------------------

# - Primary Key
#
#   Many relational database tables have a column that contains a unique value
#   for every row in the table. For example the customers table in the example
#   below contains a CustomerId column in the customers table. Each row in the 
#   customers table contains a different value for the CustomerId. 
#   This allows you to identify a specific customer (i.e a specific row in the table)
#   by the customer id. For example CustomerId 100 corresponds to one and only one
#   row in the customers table.
#   The CustomerId column in the Customers table is known as a "primary key". 
#
#   Similarly, the orders table in the example below contains an OrderId
#   column. Each row in the orders table contains a unique value for the
#   OrderId. For example, orderId 10248 corresponds to exactly one row 
#   in the orders table. Therefore we say that the OrderId column is the 
#   "primary key" for the orders table.
#
#
#
# - Foreign Key
#
#   Note that the orders table ALSO contains a CustomerId column.
#   The customerId in the orders table indicates which customer
#   placed a specific order. To find the name of the customer, the address
#   of the customer etc. you can look in the customers table for the row
#   that contains the specific customerId for the order.
#
#   Since a single customer can have many different orders,
#   the values of the CustomerId column in the orders table contains
#   repeats of the same CustomerId value (in different rows of the orders table). 
#   The customerId column in the orders table is NOT a primary key.
#   The customerId column in the orders table exists so that we
#   can look up the customer information for a particular order.
#   The CustomerId column in the orders table is known as a "foreign key".
#   In general a column in a table that contains information from
#   the primary key of a different table is known as a "foreign key".
#
# In summary
# - The CustomerId column in the customers table is a primary key
# - The OrderId column in the orders table is a primary key
# - The CustomerId column in the orders table is a foreign key
#

17.3.1 More about the customers and orders tables

# show the orders table arranged in the order of the customerID

sqldf("select *
       from orders
       order by CustomerID")
    OrderID CustomerID EmployeeID  OrderDate ShipperID
1     10308          2          7  9/18/1996         3
2     10365          3          3 11/27/1996         2
3     10355          4          6 11/15/1996         1
4     10383          4          8 12/16/1996         3
5     10278          5          8  8/12/1996         2
6     10280          5          2  8/14/1996         1
7     10384          5          3 12/16/1996         3
8     10265          7          2  7/25/1996         1
9     10297          7          5   9/4/1996         2
10    10360          7          4 11/22/1996         3
11    10436          7          3   2/5/1997         2
12    10326          8          4 10/10/1996         2
13    10331          9          9 10/16/1996         1
14    10340          9          1 10/29/1996         3
15    10362          9          3 11/25/1996         1
16    10389         10          4 12/20/1996         2
17    10410         10          3  1/10/1997         3
18    10411         10          9  1/10/1997         3
19    10431         10          4  1/30/1997         2
20    10289         11          7  8/26/1996         3
21    10259         13          4  7/18/1996         3
22    10254         14          5  7/11/1996         2
23    10370         14          6  12/3/1996         2
24    10290         15          8  8/27/1996         1
25    10435         16          8   2/4/1997         2
26    10363         17          4 11/26/1996         3
27    10391         17          3 12/23/1996         3
28    10311         18          1  9/20/1996         3
29    10364         19          1 11/26/1996         1
30    10400         19          1   1/1/1997         3
31    10258         20          1  7/17/1996         1
32    10263         20          9  7/23/1996         3
33    10351         20          1 11/11/1996         1
34    10368         20          2 11/29/1996         2
35    10382         20          4 12/13/1996         1
36    10390         20          6 12/23/1996         1
37    10402         20          8   1/2/1997         2
38    10403         20          4   1/3/1997         3
39    10430         20          4  1/30/1997         1
40    10442         20          3  2/11/1997         2
41    10347         21          4  11/6/1996         3
42    10386         21          9 12/18/1996         3
43    10414         21          2  1/14/1997         3
44    10408         23          8   1/8/1997         1
45    10264         24          6  7/24/1996         3
46    10327         24          2 10/11/1996         1
47    10378         24          5 12/10/1996         3
48    10434         24          3   2/3/1997         2
49    10267         25          4  7/29/1996         1
50    10337         25          4 10/24/1996         3
51    10342         25          4 10/30/1996         2
52    10396         25          1 12/27/1996         3
53    10422         27          2  1/22/1997         1
54    10328         28          4 10/14/1996         3
55    10352         28          3 11/12/1996         3
56    10366         29          8 11/28/1996         2
57    10426         29          4  1/27/1997         1
58    10303         30          7  9/11/1996         2
59    10423         31          6  1/23/1997         3
60    10268         33          8  7/30/1996         3
61    10250         34          4   7/8/1996         2
62    10253         34          3  7/10/1996         2
63    10257         35          4  7/16/1996         3
64    10395         35          6 12/26/1996         1
65    10375         36          3  12/6/1996         2
66    10394         36          1 12/25/1996         3
67    10415         36          3  1/15/1997         1
68    10298         37          6   9/5/1996         2
69    10309         37          3  9/19/1996         1
70    10335         37          7 10/22/1996         2
71    10373         37          4  12/5/1996         3
72    10380         37          8 12/12/1996         3
73    10429         37          3  1/29/1997         2
74    10315         38          4  9/26/1996         2
75    10318         38          8  10/1/1996         2
76    10321         38          3  10/3/1996         2
77    10323         39          4  10/7/1996         1
78    10325         39          1  10/9/1996         3
79    10350         41          6 11/11/1996         2
80    10358         41          5 11/20/1996         1
81    10371         41          1  12/3/1996         1
82    10413         41          3  1/14/1997         2
83    10425         41          6  1/24/1997         2
84    10279         44          8  8/13/1996         2
85    10284         44          4  8/19/1996         1
86    10343         44          4 10/31/1996         1
87    10283         46          3  8/16/1996         3
88    10296         46          6   9/3/1996         1
89    10330         46          3 10/16/1996         1
90    10357         46          1 11/19/1996         3
91    10381         46          3 12/12/1996         3
92    10405         47          1   1/6/1997         1
93    10307         48          2  9/17/1996         2
94    10317         48          6  9/30/1996         1
95    10275         49          1   8/7/1996         1
96    10300         49          2   9/9/1996         2
97    10404         49          2   1/3/1997         1
98    10332         51          3 10/17/1996         2
99    10339         51          2 10/28/1996         2
100   10376         51          1  12/9/1996         2
101   10424         51          7  1/23/1997         2
102   10439         51          6   2/7/1997         3
103   10277         52          2   8/9/1996         3
104   10409         54          3   1/9/1997         1
105   10260         55          4  7/19/1996         1
106   10305         55          8  9/13/1996         3
107   10338         55          4 10/25/1996         3
108   10441         55          3  2/10/1997         2
109   10407         56          2   1/7/1997         2
110   10322         58          7  10/4/1996         3
111   10354         58          8 11/14/1996         3
112   10353         59          7 11/13/1996         3
113   10392         59          2 12/24/1996         3
114   10427         59          4  1/27/1997         2
115   10336         60          7 10/23/1996         2
116   10397         60          5 12/27/1996         1
117   10433         60          3   2/3/1997         3
118   10261         61          4  7/19/1996         2
119   10291         61          6  8/27/1996         2
120   10379         61          2 12/11/1996         1
121   10421         61          8  1/21/1997         1
122   10372         62          5  12/4/1996         2
123   10406         62          7   1/7/1997         1
124   10273         63          3   8/5/1996         3
125   10285         63          1  8/20/1996         2
126   10286         63          8  8/21/1996         3
127   10313         63          2  9/24/1996         2
128   10345         63          2  11/4/1996         2
129   10361         63          1 11/22/1996         2
130   10418         63          4  1/17/1997         1
131   10262         65          8  7/22/1996         3
132   10272         65          6   8/2/1996         2
133   10294         65          4  8/30/1996         2
134   10314         65          1  9/25/1996         2
135   10316         65          1  9/27/1996         3
136   10346         65          3  11/5/1996         3
137   10401         65          1   1/1/1997         1
138   10288         66          4  8/23/1996         1
139   10428         66          7  1/28/1997         1
140   10443         66          8  2/12/1997         1
141   10287         67          8  8/22/1996         3
142   10299         67          4   9/6/1996         2
143   10255         68          9  7/12/1996         3
144   10419         68          4  1/20/1997         2
145   10281         69          4  8/14/1996         1
146   10282         69          4  8/15/1996         1
147   10306         69          1  9/16/1996         3
148   10387         70          1 12/18/1996         2
149   10324         71          9  10/8/1996         1
150   10393         71          1 12/25/1996         3
151   10398         71          2 12/30/1996         3
152   10440         71          4  2/10/1997         2
153   10359         72          5 11/21/1996         3
154   10377         72          1  12/9/1996         3
155   10388         72          2 12/19/1996         1
156   10341         73          7 10/29/1996         3
157   10417         73          4  1/16/1997         3
158   10271         75          6   8/1/1996         2
159   10329         75          4 10/15/1996         2
160   10349         75          7  11/8/1996         1
161   10369         75          8  12/2/1996         2
162   10385         75          1 12/17/1996         2
163   10432         75          3  1/31/1997         2
164   10252         76          4   7/9/1996         2
165   10302         76          4  9/10/1996         2
166   10310         77          8  9/20/1996         2
167   10438         79          3   2/6/1997         2
168   10276         80          8   8/8/1996         3
169   10293         80          1  8/29/1996         3
170   10304         80          1  9/12/1996         2
171   10319         80          7  10/2/1996         3
172   10249         81          6   7/5/1996         1
173   10292         81          1  8/28/1996         2
174   10367         83          7 11/28/1996         3
175   10399         83          8 12/31/1996         3
176   10251         84          3   7/8/1996         1
177   10334         84          8 10/21/1996         2
178   10274         85          6   8/6/1996         1
179   10295         85          2   9/2/1996         2
180   10301         86          8   9/9/1996         2
181   10312         86          2  9/23/1996         2
182   10348         86          4  11/7/1996         2
183   10356         86          6 11/18/1996         2
184   10266         87          3  7/26/1996         3
185   10270         87          1   8/1/1996         1
186   10320         87          5  10/3/1996         3
187   10333         87          5 10/18/1996         3
188   10412         87          8  1/13/1997         2
189   10416         87          8  1/16/1997         3
190   10437         87          8   2/5/1997         1
191   10256         88          3  7/15/1996         2
192   10420         88          3  1/21/1997         1
193   10269         89          5  7/31/1996         1
194   10344         89          4  11/1/1996         2
195   10248         90          5   7/4/1996         3
196   10374         91          1  12/5/1996         3
# Show customer info for customer id 10

sqldf("select * from customers where CustomerId = 10")
  CustomerID           CustomerName       ContactName            Address      City PostalCode Country
1         10 Bottom-Dollar Marketse Elizabeth Lincoln 23 Tsawassen Blvd. Tsawassen    T2F 8M4  Canada
# Show all orders for customer id 10

sqldf("select * from orders where CustomerId = 10")
  OrderID CustomerID EmployeeID  OrderDate ShipperID
1   10389         10          4 12/20/1996         2
2   10410         10          3  1/10/1997         3
3   10411         10          9  1/10/1997         3
4   10431         10          4  1/30/1997         2

17.3.2 JOINing customers and orders into a single output

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# using the JOIN to combine information from two tables into a single output
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


# Combine information from the customers and the orders table.
#
# Specifically, show the following data
# - customer name and country from the customers table
# - orderId, OrderDate from the orders table
#
# Only show data for all orders placed by customer 10

sqldf("select OrderID, customers.CustomerId, CustomerName, Country, OrderDate
       from orders join customers on orders.customerId = customers.CustomerId
       where customers.CustomerId = 10")   # ame result if you say orders.customerId
  OrderID CustomerID           CustomerName Country  OrderDate
1   10389         10 Bottom-Dollar Marketse  Canada 12/20/1996
2   10410         10 Bottom-Dollar Marketse  Canada  1/10/1997
3   10411         10 Bottom-Dollar Marketse  Canada  1/10/1997
4   10431         10 Bottom-Dollar Marketse  Canada  1/30/1997
# do the same for customerids 1 through 5

sql="select OrderID, customers.CustomerId, CustomerName, OrderDate
     from orders join customers on orders.customerId = customers.CustomerId
     where customers.CustomerId >= 1 and customers.CustomerId <= 5" 
sqldf(sql)
  OrderID CustomerID                       CustomerName  OrderDate
1   10308          2 Ana Trujillo Emparedados y helados  9/18/1996
2   10365          3            Antonio Moreno Taquería 11/27/1996
3   10355          4                    Around the Horn 11/15/1996
4   10383          4                    Around the Horn 12/16/1996
5   10278          5                 Berglunds snabbköp  8/12/1996
6   10280          5                 Berglunds snabbköp  8/14/1996
7   10384          5                 Berglunds snabbköp 12/16/1996
# another way, with the IN operator
sql="select OrderID, customers.CustomerId, CustomerName, OrderDate
     from orders join customers on orders.customerId = customers.CustomerId
     where customers.CustomerId IN (1,2,3,4,5)"   # you would get the same result if you said orders.customerId
sqldf(sql)
  OrderID CustomerID                       CustomerName  OrderDate
1   10308          2 Ana Trujillo Emparedados y helados  9/18/1996
2   10365          3            Antonio Moreno Taquería 11/27/1996
3   10355          4                    Around the Horn 11/15/1996
4   10383          4                    Around the Horn 12/16/1996
5   10278          5                 Berglunds snabbköp  8/12/1996
6   10280          5                 Berglunds snabbköp  8/14/1996
7   10384          5                 Berglunds snabbköp 12/16/1996
# show the orders that were shipped by shipper #3 for customers who
# are from Mexico. Sort the results by the orderId
sql="select orderId, customerName, customers.customerId, shipperId, country
     from orders join customers on orders.customerId = customers.customerId
     where shipperId = 3 and country = 'Mexico'
     order by orderId"
sqldf(sql)
  OrderID                       CustomerName CustomerID ShipperID Country
1   10259         Centro comercial Moctezuma         13         3  Mexico
2   10276                Tortuga Restaurante         80         3  Mexico
3   10293                Tortuga Restaurante         80         3  Mexico
4   10308 Ana Trujillo Emparedados y helados          2         3  Mexico
5   10319                Tortuga Restaurante         80         3  Mexico
6   10322          Pericles Comidas clásicas         58         3  Mexico
7   10354          Pericles Comidas clásicas         58         3  Mexico

17.4 One to Many relationships

#####################################################################
#
# One to Many relationships
#
# EXAMPLE:
#           a single SPECIFIC customer may have many orders
#           a single SPECIFIC order is related to exactly one customer 
#
# IN GENERAL:
#
# A single SPECIFIC row in table A may be related to many (or possibly zero)
# rows from table B. 
#
# However a single SPECIFIC row from table B is related to 
# EXACTLY ONE row from table A
#
# We'll call table A the "one table" and 
#            table B the "many table"
#
# - The "one" table has a primary key.
#
# - The "many" table has a foreign key that contains values of the
#   other table's primary key values.
#####################################################################

17.5 Many to Many relationships

#####################################################################
#
# Many to Many relationships
#
# EXAMPLE:
#           a single SPECIFIC author may have written many titles
#           a single SPECIFIC book   may have many authors
#
#
# Many to many relationships cannot be directly expressed in relational
# databases. 
# 
# To establish a many to many relationship, a "middle" table is created
# that contains the to foreign keys. Each of the foreign keys relate 
# back to the primary key of one of the two tables. 
# In the books database, the middle table is called title_authors.
#####################################################################

17.6 Practice - “one to many” or “many to many”?

###################################
# Practice identifying relationships
#
# - one to many
#
# - many to many
#
###################################

# QUESTION 
#
# A university uses to tables to track their real estate.
#
# The BUILDINGS table contains one row for each building they own.
# 
# The ROOMS table contains one row for each room in each building.
#
# Are the BUILDINGS and ROOMS tables in a "1 to many" or a 
# "many to many" relationship?
#
# Given the following definitions of a BUILDINGS table and 
# a ROOMS table draw a "crows foot" Entity Relationship Diagram.
# Make sure to show the symbols for the maximum and minimum 
# cardinalities. Explain in English what each symbol on your
# diagram means. 
#------------------------------------------------------------------ 
#    BUILDINGS TABLE
#
#    bldgCode    (PK)
#    streetAddress
#    city
#    state
#    zip
#    numOfFloors
#
#
#    ROOMS TABLE
# 
#    bldgCode (PK)
#    roomNum  (PK)
#    capacity
#    type    (classroom, dorm room, office)
#    hasVideoProjection (TRUE/FALSE)
#
#########################################################################



#########################################################################
# 
# The same university keeps track of their professors, the courses 
# that are in the course catalog and the specific offerings every
# semester. Think about how these concepts are related to each other.
#
# How are PROFESSORS related to COURSES? "one to many", "many to one", "many to many"?
#
# How are PROFESSORS related to OFFERINGS? "one to many", "many to one", "many to many"?
#
# How are COURSES related to OFFERINGS? "one to many", "many to one", "many to many"?
#
# The following tables could be used to track this info.
# Draw the crows foot Entity Relationship Diagram.
# Make sure to show the symbols for the maximum and minimum 
# cardinalities. Explain in English what each symbol on your
# diagram means. 
#------------------------------------------------------------------------
#
# PROFESSORS TABLE
#
# Description: this table contains exactly one row for each professor that is
# employed by the university. 
# 
# facultyId (PK)
# firstName
# lastName
# department
# salary
# yearHired
#
#
#
# OFFERINGS TABLE
#
# Description: this table contains info about the sections that are 
# being offered in particular semesters.
#
# crn (PK)
# courseNumber (FK1)
# semester
# year
# room
# buildingId
# facutyId (FK2)
#
#
#
# COURSES TABLE 
# Description: This table contains one row for each course in the course catalog.
# Note that not every course is offered every semester. Some courses have
# multiple offerings in a single semester.
#
# courseNumber (PK)
# courseName
# credits
# description
#############################################################################

17.7 Using JOIN with the books database

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The books database contains information about various books, authors,
# publishers, etc.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Read in the data for the books database - see the 
titles = read_csv("data/booksDatabase/titles.csv", na="NULL", show_col_types=FALSE)
authors = read_csv("data/booksDatabase/authors.csv", na="NULL", show_col_types=FALSE)
publishers = read_csv("data/booksDatabase/publishers.csv", na="NULL", show_col_types=FALSE)
title_authors = read_csv("data/booksDatabase/title_authors.csv", na="NULL", show_col_types=FALSE)
royalties = read_csv("data/booksDatabase/royalties.csv", na="NULL", show_col_types=FALSE)


# QUESTION: Write a SQL select statement (i.e. a "query") to display the following data.
# For each title, show the 
# - title name
# - num pages
# - publisher name
# - publisher country

# ANSWER

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id"
sqldf(sql)
                            title_name pages             pub_name country
1                                1977!   107    Abatis Publishers     USA
2            But I Did It Unconciously   510    Abatis Publishers     USA
3               Exchange of Platitudes   201    Abatis Publishers     USA
4                     How About Never?   473    Abatis Publishers     USA
5         Just Wait Until After School    86    Abatis Publishers     USA
6                      Kiss My Boo Boo    22    Abatis Publishers     USA
7         Ask Yor System Administrator  1226      Core Dump Books     USA
8            200 Years of German Humor    14 Schandenfreude Press Germany
9                    I Blame My Mother   333 Schandenfreude Press Germany
10 What Are The Civilian Applications?   802 Schandenfreude Press Germany
11         Not Without My Fabrerge Egg    NA       AAA Publishing     USA
12    Perhaps It's a Glandular Problem   826       AAA Publishing     USA
13           Spontaneous, Not Annoying   507       AAA Publishing     USA
# To be more clear where each column comes from you can
# write the columns in the format tableName.columnName.
# (In this example it's not necessary to do so - but can make your code easier
# to understand. Later we'll see examples where you must specify the 
# tablename.colname format).

sql = "select titles.title_name, titles.pages, publishers.pub_name, publishers.country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id"
sqldf(sql)
                            title_name pages             pub_name country
1                                1977!   107    Abatis Publishers     USA
2            But I Did It Unconciously   510    Abatis Publishers     USA
3               Exchange of Platitudes   201    Abatis Publishers     USA
4                     How About Never?   473    Abatis Publishers     USA
5         Just Wait Until After School    86    Abatis Publishers     USA
6                      Kiss My Boo Boo    22    Abatis Publishers     USA
7         Ask Yor System Administrator  1226      Core Dump Books     USA
8            200 Years of German Humor    14 Schandenfreude Press Germany
9                    I Blame My Mother   333 Schandenfreude Press Germany
10 What Are The Civilian Applications?   802 Schandenfreude Press Germany
11         Not Without My Fabrerge Egg    NA       AAA Publishing     USA
12    Perhaps It's a Glandular Problem   826       AAA Publishing     USA
13           Spontaneous, Not Annoying   507       AAA Publishing     USA
See next section about “CROSS JOINS”

Before going further, please take a few minutes and read the information on “CROSS JOIN” that is found in the next section of this website. The discussion of CROSS JOINS will be moved to this location soon…

# DEEPER EXPLANATION
#
# We learned earlier about a "CROSS JOIN" that takes the rows of two different
# tables and combines them in every way. This is useful when you want to see
# every possible combination of the rows of two different tables. For example,
# if you have a table of shirts and a table of pants and you want to see every
# possible combination of a shirt with a pair of pants, a cross join might
# be appropriate. However, most of the time that is not what you want to do.
# Most of the time, the appropropriate type of join is known a an "INNER JOIN"
# (or just simply "JOIN", the word inner is optional).
# An inner join matches up those rows from two
# different tables that share a primary key and a foreign key with the same
# values.
#
# However, a cross join is actually at the heart of how the database 
# performs an inner join internally. Specifically whenever the SQL code
# specifies an inner join (i.e. "join") the database first performs a
# cross join and then removes the rows from the cross join 
# results that do not match the ON clause of the INNER JOIN.
#
# Let's look at that in a little more detail ...
#
# First thing that happens when the above query is processed internally
# is the following cross join
sql = "select titles.*, publishers.*
       from titles cross join publishers"
sqldf(sql)
   title_id                          title_name       type pub_id pages price   sales         pubdate pub_id             pub_name          city state country
1       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P01    Abatis Publishers      New York    NY     USA
2       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
3       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
4       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
5       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
6       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P01    Abatis Publishers      New York    NY     USA
7       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P02      Core Dump Books San Francisco    CA     USA
8       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
9       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
10      T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P05       AAA Publishing      Berkeley    CA     USA
11      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P01    Abatis Publishers      New York    NY     USA
12      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
13      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
14      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
15      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
16      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
17      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P02      Core Dump Books San Francisco    CA     USA
18      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
19      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
20      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P05       AAA Publishing      Berkeley    CA     USA
21      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
22      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P02      Core Dump Books San Francisco    CA     USA
23      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
24      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
25      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P05       AAA Publishing      Berkeley    CA     USA
26      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P01    Abatis Publishers      New York    NY     USA
27      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
28      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
29      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
30      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
31      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P01    Abatis Publishers      New York    NY     USA
32      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P02      Core Dump Books San Francisco    CA     USA
33      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
34      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
35      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P05       AAA Publishing      Berkeley    CA     USA
36      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
37      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P02      Core Dump Books San Francisco    CA     USA
38      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
39      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
40      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P05       AAA Publishing      Berkeley    CA     USA
41      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P01    Abatis Publishers      New York    NY     USA
42      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P02      Core Dump Books San Francisco    CA     USA
43      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
44      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
45      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P05       AAA Publishing      Berkeley    CA     USA
46      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P01    Abatis Publishers      New York    NY     USA
47      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P02      Core Dump Books San Francisco    CA     USA
48      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P03 Schandenfreude Press       Hamburg  <NA> Germany
49      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P04    Tneterhooks Press      Berkeley    CA     USA
50      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P05       AAA Publishing      Berkeley    CA     USA
51      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P01    Abatis Publishers      New York    NY     USA
52      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
53      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
54      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
55      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
56      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P01    Abatis Publishers      New York    NY     USA
57      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
58      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
59      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
60      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
61      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
62      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P02      Core Dump Books San Francisco    CA     USA
63      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
64      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P04    Tneterhooks Press      Berkeley    CA     USA
65      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P05       AAA Publishing      Berkeley    CA     USA
# after that , the rows that don't match the "on" clause in the inner join
# are removed from the results. This can be seen by adding a where clause 
# to the cross join shown above.

sql = "select titles.*, publishers.*
       from titles cross join publishers
       where titles.pub_id = publishers.pub_id"
sqldf(sql)
   title_id                          title_name       type pub_id pages price   sales         pubdate pub_id             pub_name          city state country
1       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P01    Abatis Publishers      New York    NY     USA
2       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
3       T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
4       T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
5       T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
6       T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P01    Abatis Publishers      New York    NY     USA
7       T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
8       T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
9       T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P01    Abatis Publishers      New York    NY     USA
10      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P05       AAA Publishing      Berkeley    CA     USA
11      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
12      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
13      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
# The last thing that happens is that just the columns that appear in 
# the select clause are displayed. Therefore, the following two 
# queries are equivalent

# This is the inner join version (we copied this from above)
sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id"
sqldf(sql)
                            title_name pages             pub_name country
1                                1977!   107    Abatis Publishers     USA
2            But I Did It Unconciously   510    Abatis Publishers     USA
3               Exchange of Platitudes   201    Abatis Publishers     USA
4                     How About Never?   473    Abatis Publishers     USA
5         Just Wait Until After School    86    Abatis Publishers     USA
6                      Kiss My Boo Boo    22    Abatis Publishers     USA
7         Ask Yor System Administrator  1226      Core Dump Books     USA
8            200 Years of German Humor    14 Schandenfreude Press Germany
9                    I Blame My Mother   333 Schandenfreude Press Germany
10 What Are The Civilian Applications?   802 Schandenfreude Press Germany
11         Not Without My Fabrerge Egg    NA       AAA Publishing     USA
12    Perhaps It's a Glandular Problem   826       AAA Publishing     USA
13           Spontaneous, Not Annoying   507       AAA Publishing     USA
# Version with a cross join - this is what happens internally when
# you write an inner join (i.e. "join")
sql = "select title_name, pages, pub_name, country
       from titles cross join publishers
       where titles.pub_id = publishers.pub_id"
sqldf(sql)
                            title_name pages             pub_name country
1                                1977!   107    Abatis Publishers     USA
2            200 Years of German Humor    14 Schandenfreude Press Germany
3         Ask Yor System Administrator  1226      Core Dump Books     USA
4            But I Did It Unconciously   510    Abatis Publishers     USA
5               Exchange of Platitudes   201    Abatis Publishers     USA
6                     How About Never?   473    Abatis Publishers     USA
7                    I Blame My Mother   333 Schandenfreude Press Germany
8         Just Wait Until After School    86    Abatis Publishers     USA
9                      Kiss My Boo Boo    22    Abatis Publishers     USA
10         Not Without My Fabrerge Egg    NA       AAA Publishing     USA
11    Perhaps It's a Glandular Problem   826       AAA Publishing     USA
12           Spontaneous, Not Annoying   507       AAA Publishing     USA
13 What Are The Civilian Applications?   802 Schandenfreude Press Germany
# QUESTION
# Only show results for books that have at least 100 pages

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where pages >= 100"
sqldf(sql)
                           title_name pages             pub_name country
1                               1977!   107    Abatis Publishers     USA
2        Ask Yor System Administrator  1226      Core Dump Books     USA
3           But I Did It Unconciously   510    Abatis Publishers     USA
4              Exchange of Platitudes   201    Abatis Publishers     USA
5                    How About Never?   473    Abatis Publishers     USA
6                   I Blame My Mother   333 Schandenfreude Press Germany
7    Perhaps It's a Glandular Problem   826       AAA Publishing     USA
8           Spontaneous, Not Annoying   507       AAA Publishing     USA
9 What Are The Civilian Applications?   802 Schandenfreude Press Germany
# QUESTION
# Only show books that are at least 100 pages and published in the USA

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where pages >= 100 and country = 'USA'"
sqldf(sql)
                        title_name pages          pub_name country
1                            1977!   107 Abatis Publishers     USA
2        But I Did It Unconciously   510 Abatis Publishers     USA
3           Exchange of Platitudes   201 Abatis Publishers     USA
4                 How About Never?   473 Abatis Publishers     USA
5     Ask Yor System Administrator  1226   Core Dump Books     USA
6 Perhaps It's a Glandular Problem   826    AAA Publishing     USA
7        Spontaneous, Not Annoying   507    AAA Publishing     USA
# QUESTION
# Show books that are shorter than 100 pages  and books that are at least 1000 pages
# Do not show any other books.
# Only show books from the USA.

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' "
sqldf(sql)
                    title_name pages          pub_name country
1 Just Wait Until After School    86 Abatis Publishers     USA
2              Kiss My Boo Boo    22 Abatis Publishers     USA
3 Ask Yor System Administrator  1226   Core Dump Books     USA
# QUESTION
# show the results in the order of the length of the book (i.e. # of pages)
# show the shortest books at the top

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' 
       order by pages asc"     # asc is optional - you don't have to write it - it is the default
sqldf(sql)
                    title_name pages          pub_name country
1              Kiss My Boo Boo    22 Abatis Publishers     USA
2 Just Wait Until After School    86 Abatis Publishers     USA
3 Ask Yor System Administrator  1226   Core Dump Books     USA
# QUESTION
# do the same thing but put the largest books at the top

sql = "select title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' 
       order by pages desc"     # you must write desc (i.e. descending)
sqldf(sql)
                    title_name pages          pub_name country
1 Ask Yor System Administrator  1226   Core Dump Books     USA
2 Just Wait Until After School    86 Abatis Publishers     USA
3              Kiss My Boo Boo    22 Abatis Publishers     USA
# QUESTION - add the pub_id to the output of the previous query.


# ANSWER
#
# Because the pub_id exists in both the publishers and the titles
# tables, every time that pub_id is used in the query, you must also include
# the name of the table.
#
# For more info, see the comment below entitled
#     #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#     # select table.column ...        vs       select column ... 
#     #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

sql = "select publishers.pub_id, title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' 
       order by pages desc"     
sqldf(sql)
  pub_id                   title_name pages          pub_name country
1    P02 Ask Yor System Administrator  1226   Core Dump Books     USA
2    P01 Just Wait Until After School    86 Abatis Publishers     USA
3    P01              Kiss My Boo Boo    22 Abatis Publishers     USA
# it will also work if you write titles.pub_id because once the table have been
# joined together, the row from the publishers table and the corresponding
# row from the titles table both have the same value for the pub_id
#
# For more info, see the comment below entitled
#
#     #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#     # select table.column ...        vs       select column ... 
#     #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#

sql = "select titles.pub_id, title_name, pages, pub_name, country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' 
       order by pages desc"     
 sqldf(sql)
  pub_id                   title_name pages          pub_name country
1    P02 Ask Yor System Administrator  1226   Core Dump Books     USA
2    P01 Just Wait Until After School    86 Abatis Publishers     USA
3    P01              Kiss My Boo Boo    22 Abatis Publishers     USA
# You could include the table names even for columns that don't rquire it.
sql = "select titles.pub_id, titles.title_name, titles.pages, publishers.pub_name, publishers.country
       from publishers  join   titles  on  publishers.pub_id = titles.pub_id
       where (pages < 100 or pages >= 1000) and country = 'USA' 
       order by pages desc"     
sqldf(sql)
  pub_id                   title_name pages          pub_name country
1    P02 Ask Yor System Administrator  1226   Core Dump Books     USA
2    P01 Just Wait Until After School    86 Abatis Publishers     USA
3    P01              Kiss My Boo Boo    22 Abatis Publishers     USA
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# select table.column ...        vs       select column ... 
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# You must include table names in the SELECT clause for columns that 
# appear in more than one table. Column names that that only appear 
# in a single table in the query do NOT need to be preceded by the table name. 
#
# However, it is PERMISSIBLE to include the table name for ANY column (even
# if the column only appears in a single table).
# Some people prefer to include the table name even when it 
# is not necessary (as done above) to make it clearer to someone reading the 
# code which table contains which columns.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# QUESTION
#
# Show the title_id, the title, and the author ID
# for each author that had a part in writing a book. 
#
# If there is more than one author for a title then show a different row
# in the output for each author.
#
# Show the rows in alphabetical order by title.

# ANSWER

sql = "select titles.title_id, title_name, au_id
       from titles join title_authors   on    titles.title_id = title_authors.title_id
       order by title_name"
sqldf(sql)
   title_id                          title_name au_id
1       T01                               1977!   A01
2       T02           200 Years of German Humor   A01
3       T03        Ask Yor System Administrator   A05
4       T04           But I Did It Unconciously   A03
5       T04           But I Did It Unconciously   A04
6       T05              Exchange of Platitudes   A04
7       T06                    How About Never?   A02
8       T07                   I Blame My Mother   A02
9       T07                   I Blame My Mother   A04
10      T08        Just Wait Until After School   A06
11      T09                     Kiss My Boo Boo   A06
12      T10         Not Without My Fabrerge Egg   A02
13      T11    Perhaps It's a Glandular Problem   A03
14      T11    Perhaps It's a Glandular Problem   A04
15      T11    Perhaps It's a Glandular Problem   A06
16      T12           Spontaneous, Not Annoying   A02
17      T13 What Are The Civilian Applications?   A01

17.8 Joining three or more tables

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Joining three or more tables
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# For many queries you will need to join more than two tables together.
# For example, suppose we wanted to list the titles and the author names
# for each title.
#
# The title_name column is in the titles table. 
# The au_fname and au_lname columns are in the authors tables.
#
# Therefore, you will obviously need to include the titles and authors
# tables in the select statement. However, the only way to figure
# out which authors wrote which titles is to 
# also look at the title_authors table. Therefore the title_authors
# table must also be part of the query.
#
# The best way to understand which tables must be included in a query
# is to look at the ERD (Entity Relationship Diagram) for the database.
# If you need data from two different tables that are not directly 
# connected to each other on the diagram, you will also need to join all of the
# tables that lie BETWEEN those tables also.
#
# To join multiple tables together, look at the diagram and find "path" on the
# diagram starting with one table that you need data from and ending with 
# another table that you need data from. 
#
# For example, if you want to show
# the author names and the title names, you can start with the authors
# table, then join onto that the title_authors table, then finally join 
# the authors table.
# 
# Alternatively, you could start with the titles table, then join onto
# that the title_authors table and then finally join the authors table.
# It doesn't make a difference which table you start with and which table you
# end with, as long as you join all of the tables in the path of tables.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


# QUESTION
# Show the the same info as the previous query but instead of the au_id,
# show the authors first and last names.

sql = "select titles.title_id, title_name, authors.au_id, au_fname, au_lname
       from titles join title_authors   on    titles.title_id = title_authors.title_id
                   join authors         on    authors.au_id   = title_authors.au_id
       order by titles.title_id"
sqldf(sql)
   title_id                          title_name au_id  au_fname  au_lname
1       T01                               1977!   A01     Sarah   Buchman
2       T02           200 Years of German Humor   A01     Sarah   Buchman
3       T03        Ask Yor System Administrator   A05 Christian     Kells
4       T04           But I Did It Unconciously   A03    Hallie      Hull
5       T04           But I Did It Unconciously   A04      Klee      Hull
6       T05              Exchange of Platitudes   A04      Klee      Hull
7       T06                    How About Never?   A02     Wendy Heydemark
8       T07                   I Blame My Mother   A02     Wendy Heydemark
9       T07                   I Blame My Mother   A04      Klee      Hull
10      T08        Just Wait Until After School   A06    Harvey   Kellsey
11      T09                     Kiss My Boo Boo   A06    Harvey   Kellsey
12      T10         Not Without My Fabrerge Egg   A02     Wendy Heydemark
13      T11    Perhaps It's a Glandular Problem   A03    Hallie      Hull
14      T11    Perhaps It's a Glandular Problem   A04      Klee      Hull
15      T11    Perhaps It's a Glandular Problem   A06    Harvey   Kellsey
16      T12           Spontaneous, Not Annoying   A02     Wendy Heydemark
17      T13 What Are The Civilian Applications?   A01     Sarah   Buchman
# QUESTION 
# Same as previous question but show the au_id in addition to the authors' names.

# ANSWER
# The only extra issue in this question is that since the au_id column 
# appears in more than one table in the query (i.e. it's in titles
# and in the title_authors tables), You cannot simply type au_id in 
# the select clause - that would be an ambiguous column name. 
# Rather you must write either authors.au_id or title_authors.au_id.
# Either one authors.au_id or title_authors.au_id will yield the exact
# same results because after the "on" clauses got finished processing
# the only rows left from the cross join are rows in which the
# authors.au_id and title_authors.au_id columns contain the same values.
# or title_authors.au_id. You 

# WRONG ... the au_id column is ambiguous - you need to specify the table name
sql = "select titles.title_id, title_name, au_id, au_fname, au_lname
       from titles join title_authors   on    titles.title_id = title_authors.title_id
                   join authors         on    authors.au_id   = title_authors.au_id
       order by titles.title_id"
sqldf(sql)
Error: ambiguous column name: au_id
# This works (authors.au_id)
sql = "select titles.title_id, title_name, authors.au_id, au_fname, au_lname
       from titles join title_authors   on    titles.title_id = title_authors.title_id
                   join authors         on    authors.au_id   = title_authors.au_id
       order by titles.title_id"
sqldf(sql)
   title_id                          title_name au_id  au_fname  au_lname
1       T01                               1977!   A01     Sarah   Buchman
2       T02           200 Years of German Humor   A01     Sarah   Buchman
3       T03        Ask Yor System Administrator   A05 Christian     Kells
4       T04           But I Did It Unconciously   A03    Hallie      Hull
5       T04           But I Did It Unconciously   A04      Klee      Hull
6       T05              Exchange of Platitudes   A04      Klee      Hull
7       T06                    How About Never?   A02     Wendy Heydemark
8       T07                   I Blame My Mother   A02     Wendy Heydemark
9       T07                   I Blame My Mother   A04      Klee      Hull
10      T08        Just Wait Until After School   A06    Harvey   Kellsey
11      T09                     Kiss My Boo Boo   A06    Harvey   Kellsey
12      T10         Not Without My Fabrerge Egg   A02     Wendy Heydemark
13      T11    Perhaps It's a Glandular Problem   A03    Hallie      Hull
14      T11    Perhaps It's a Glandular Problem   A04      Klee      Hull
15      T11    Perhaps It's a Glandular Problem   A06    Harvey   Kellsey
16      T12           Spontaneous, Not Annoying   A02     Wendy Heydemark
17      T13 What Are The Civilian Applications?   A01     Sarah   Buchman
# This also works (title_authors.au_id)
sql = "select titles.title_id, title_name, title_authors.au_id, au_fname, au_lname
       from titles join title_authors   on    titles.title_id = title_authors.title_id
                   join authors         on    authors.au_id   = title_authors.au_id
       order by titles.title_id"
sqldf(sql)
   title_id                          title_name au_id  au_fname  au_lname
1       T01                               1977!   A01     Sarah   Buchman
2       T02           200 Years of German Humor   A01     Sarah   Buchman
3       T03        Ask Yor System Administrator   A05 Christian     Kells
4       T04           But I Did It Unconciously   A03    Hallie      Hull
5       T04           But I Did It Unconciously   A04      Klee      Hull
6       T05              Exchange of Platitudes   A04      Klee      Hull
7       T06                    How About Never?   A02     Wendy Heydemark
8       T07                   I Blame My Mother   A02     Wendy Heydemark
9       T07                   I Blame My Mother   A04      Klee      Hull
10      T08        Just Wait Until After School   A06    Harvey   Kellsey
11      T09                     Kiss My Boo Boo   A06    Harvey   Kellsey
12      T10         Not Without My Fabrerge Egg   A02     Wendy Heydemark
13      T11    Perhaps It's a Glandular Problem   A03    Hallie      Hull
14      T11    Perhaps It's a Glandular Problem   A04      Klee      Hull
15      T11    Perhaps It's a Glandular Problem   A06    Harvey   Kellsey
16      T12           Spontaneous, Not Annoying   A02     Wendy Heydemark
17      T13 What Are The Civilian Applications?   A01     Sarah   Buchman
# QUESTION
# 19.   List the publisher name, the title_name, the authors first and last names,
#     the author's state and the publishers state for all books.
#     Make sure that in the output the authors state and publishers state
#     columns are clearly labeled as such.
#     Sort the rows alphabetically by the title name. If a title has more than one 
#     author then there will be a different row in the output for each author.
#     Sort those rows alphabetically by the publisher name, then by the 
#     title and finally by the author's names.

# ANSWER
# This query requires joining 4 different tables.
# See the code for details.
# Notice that in the order by the au_lname comes before au_fname (this is in
# general how people's names are sorted alphabetically)

sql = "select pub_name, title_name, au_fname, au_lname,
              authors.state as 'author state', 
              publishers.state as 'publisher state'
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id
  order by pub_name, title_name, au_lname, au_fname"
sqldf(sql)
               pub_name                          title_name  au_fname  au_lname author state publisher state
1        AAA Publishing         Not Without My Fabrerge Egg     Wendy Heydemark           CO              CA
2        AAA Publishing    Perhaps It's a Glandular Problem    Hallie      Hull           CA              CA
3        AAA Publishing    Perhaps It's a Glandular Problem      Klee      Hull           CA              CA
4        AAA Publishing    Perhaps It's a Glandular Problem    Harvey   Kellsey           CA              CA
5        AAA Publishing           Spontaneous, Not Annoying     Wendy Heydemark           CO              CA
6     Abatis Publishers                               1977!     Sarah   Buchman           NY              NY
7     Abatis Publishers           But I Did It Unconciously    Hallie      Hull           CA              NY
8     Abatis Publishers           But I Did It Unconciously      Klee      Hull           CA              NY
9     Abatis Publishers              Exchange of Platitudes      Klee      Hull           CA              NY
10    Abatis Publishers                    How About Never?     Wendy Heydemark           CO              NY
11    Abatis Publishers        Just Wait Until After School    Harvey   Kellsey           CA              NY
12    Abatis Publishers                     Kiss My Boo Boo    Harvey   Kellsey           CA              NY
13      Core Dump Books        Ask Yor System Administrator Christian     Kells           NY              CA
14 Schandenfreude Press           200 Years of German Humor     Sarah   Buchman           NY            <NA>
15 Schandenfreude Press                   I Blame My Mother     Wendy Heydemark           CO            <NA>
16 Schandenfreude Press                   I Blame My Mother      Klee      Hull           CA            <NA>
17 Schandenfreude Press What Are The Civilian Applications?     Sarah   Buchman           NY            <NA>
# The following shows a breakdown of what is actually happening when you 
# join multiple tables. All of the following "steps" happen at once when 
# you run the single query shown above in the answer. This breakdown
# of the steps helps you to understand what exactly is happening.

# step 1 - the first two tables are joined 
sql = "select authors.*, title_authors.*
    from authors join title_authors on authors.au_id = title_authors.au_id"
sqldf(sql)
   au_id  au_fname  au_lname        phone              address          city state   zip title_id au_id au_order royalty_shares
1    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T01   A01        1            1.0
2    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T02   A01        1            1.0
3    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T13   A01        1            1.0
4    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T06   A02        1            1.0
5    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T07   A02        1            0.5
6    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T10   A02        1            1.0
7    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T12   A02        1            1.0
8    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A03        1            0.6
9    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A03        2            0.3
10   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A04        2            0.4
11   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T05   A04        1            1.0
12   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T07   A04        2            0.5
13   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A04        3            0.3
14   A05 Christian     Kells 212-771-4680       114 Horatio St      New York    NY 10014      T03   A05        1            1.0
15   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T08   A06        1            1.0
16   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T09   A06        1            1.0
17   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T11   A06        1            0.4
# step 2 - the results from the join of the
#          1st two tables (authors and title_authors) is joined with
#          the 3rd table (titles)

sql = "select authors.*, title_authors.*, titles.*
    from authors join title_authors on authors.au_id = title_authors.au_id
                          join titles on title_authors.title_id = titles.title_id" 

sqldf(sql)
   au_id  au_fname  au_lname        phone              address          city state   zip title_id au_id au_order royalty_shares title_id                          title_name       type pub_id pages price   sales         pubdate
1    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T01   A01        1            1.0      T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00
2    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T02   A01        1            1.0      T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00
3    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T13   A01        1            1.0      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00
4    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T06   A02        1            1.0      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00
5    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T07   A02        1            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00
6    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T10   A02        1            1.0      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>
7    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T12   A02        1            1.0      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00
8    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A03        1            0.6      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00
9    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A03        2            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00
10   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A04        2            0.4      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00
11   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T05   A04        1            1.0      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00
12   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T07   A04        2            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00
13   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A04        3            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00
14   A05 Christian     Kells 212-771-4680       114 Horatio St      New York    NY 10014      T03   A05        1            1.0      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00
15   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T08   A06        1            1.0      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00
16   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T09   A06        1            1.0      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00
17   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T11   A06        1            0.4      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00
# step 3 - the results from the join of the first 3 tables (authors,
#          title_authors, titles) is then joined with the 4th table (publishers)

sql = "select authors.*, title_authors.*, titles.*, publishers.*
    from authors join title_authors on authors.au_id = title_authors.au_id
                          join titles on title_authors.title_id = titles.title_id
                        join publishers on publishers.pub_id = titles.pub_id" 

sqldf(sql)
   au_id  au_fname  au_lname        phone              address          city state   zip title_id au_id au_order royalty_shares title_id                          title_name       type pub_id pages price   sales         pubdate pub_id             pub_name          city state country
1    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T01   A01        1            1.0      T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P01    Abatis Publishers      New York    NY     USA
2    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T02   A01        1            1.0      T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
3    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T13   A01        1            1.0      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
4    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T06   A02        1            1.0      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P01    Abatis Publishers      New York    NY     USA
5    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T07   A02        1            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
6    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T10   A02        1            1.0      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P05       AAA Publishing      Berkeley    CA     USA
7    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T12   A02        1            1.0      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
8    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A03        1            0.6      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
9    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A03        2            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
10   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A04        2            0.4      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
11   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T05   A04        1            1.0      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
12   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T07   A04        2            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
13   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A04        3            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
14   A05 Christian     Kells 212-771-4680       114 Horatio St      New York    NY 10014      T03   A05        1            1.0      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
15   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T08   A06        1            1.0      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
16   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T09   A06        1            1.0      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P01    Abatis Publishers      New York    NY     USA
17   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T11   A06        1            0.4      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
# step 4 - the order by clause puts the rows in the correct order
sql = "select authors.*, title_authors.*, titles.*, publishers.*
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id"
sqldf(sql)
   au_id  au_fname  au_lname        phone              address          city state   zip title_id au_id au_order royalty_shares title_id                          title_name       type pub_id pages price   sales         pubdate pub_id             pub_name          city state country
1    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T01   A01        1            1.0      T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00    P01    Abatis Publishers      New York    NY     USA
2    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T02   A01        1            1.0      T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
3    A01     Sarah   Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468      T13   A01        1            1.0      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
4    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T06   A02        1            1.0      T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00    P01    Abatis Publishers      New York    NY     USA
5    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T07   A02        1            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
6    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T10   A02        1            1.0      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>    P05       AAA Publishing      Berkeley    CA     USA
7    A02     Wendy Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303      T12   A02        1            1.0      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
8    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A03        1            0.6      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
9    A03    Hallie      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A03        2            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
10   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T04   A04        2            0.4      T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00    P01    Abatis Publishers      New York    NY     USA
11   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T05   A04        1            1.0      T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
12   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T07   A04        2            0.5      T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00    P03 Schandenfreude Press       Hamburg  <NA> Germany
13   A04      Klee      Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123      T11   A04        3            0.3      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
14   A05 Christian     Kells 212-771-4680       114 Horatio St      New York    NY 10014      T03   A05        1            1.0      T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00    P02      Core Dump Books San Francisco    CA     USA
15   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T08   A06        1            1.0      T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00    P01    Abatis Publishers      New York    NY     USA
16   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T09   A06        1            1.0      T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00    P01    Abatis Publishers      New York    NY     USA
17   A06    Harvey   Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305      T11   A06        1            0.4      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00    P05       AAA Publishing      Berkeley    CA     USA
# step 5 - the select clause specifies which columns are actually displayed
sql = "select pub_name, title_name, au_fname, au_lname,
              authors.state as 'author state', 
              publishers.state as 'publisher state'
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id
  order by pub_name, title_name, au_lname, au_fname"
sqldf(sql)
               pub_name                          title_name  au_fname  au_lname author state publisher state
1        AAA Publishing         Not Without My Fabrerge Egg     Wendy Heydemark           CO              CA
2        AAA Publishing    Perhaps It's a Glandular Problem    Hallie      Hull           CA              CA
3        AAA Publishing    Perhaps It's a Glandular Problem      Klee      Hull           CA              CA
4        AAA Publishing    Perhaps It's a Glandular Problem    Harvey   Kellsey           CA              CA
5        AAA Publishing           Spontaneous, Not Annoying     Wendy Heydemark           CO              CA
6     Abatis Publishers                               1977!     Sarah   Buchman           NY              NY
7     Abatis Publishers           But I Did It Unconciously    Hallie      Hull           CA              NY
8     Abatis Publishers           But I Did It Unconciously      Klee      Hull           CA              NY
9     Abatis Publishers              Exchange of Platitudes      Klee      Hull           CA              NY
10    Abatis Publishers                    How About Never?     Wendy Heydemark           CO              NY
11    Abatis Publishers        Just Wait Until After School    Harvey   Kellsey           CA              NY
12    Abatis Publishers                     Kiss My Boo Boo    Harvey   Kellsey           CA              NY
13      Core Dump Books        Ask Yor System Administrator Christian     Kells           NY              CA
14 Schandenfreude Press           200 Years of German Humor     Sarah   Buchman           NY            <NA>
15 Schandenfreude Press                   I Blame My Mother     Wendy Heydemark           CO            <NA>
16 Schandenfreude Press                   I Blame My Mother      Klee      Hull           CA            <NA>
17 Schandenfreude Press What Are The Civilian Applications?     Sarah   Buchman           NY            <NA>