The purpose of this exercise is to help you learn various ways to edit the visual appearance of map data. For this exercise you will work on changing multiple aesthetics of the data to create a cartographically sound visualization that accurately depicts the information.
Based on 2020 net population growth data, Clarksville, Tennessee, is one of the fastest-growing cities in the state. The Clarksville-Montgomery County Regional Planning Commission (CMCRPC) and University of Tennessee researchers predict that the area will add about 90,000 new residents by 2040. According to the American Community Survey conducted by the U.S. Census Bureau, Montgomery County currently has a population of approximately 200,000, with nearly twenty percent (20%) being school-aged children. With a two percent (2%) population increase in the past year, the Clarksville-Montgomery County School System (CMCSS) developed a plan to build a new elementary, middle, and high school in the eastern portion of the county. An independent education advocacy group has hired you to help determine whether CMCRPC and CMCSS selected the best location for the new school complex considering the population projections.
In this exercise you will:
Software-specific directions can be found for each step below. Please submit the answers to the questions and your final map by the due date.
The data you will be using for this exercise comes from the 2019 American Community Survey conducted by the U.S. Census Bureau and from the Clarksville-Montgomery County School System.
As with the previous exercise, you will begin by launching ArcGIS Pro from the start menu or desktop shortcut. On the initial splash screen select Map from New > Blank Templates in the middle section of the screen. On the next screen, provide a name for your new project, and set the location to the folder on the computer where you are storing all of your GIS1 exercises.
This will save and open a new project with the World Topographic Map and World Hillshade in the Table of Contents.
In order to complete this exercise you will need to download the Census Tract and CMCSS Schools zip files and extract them in your project folder. You can use those links or navigate to the Exercise 3 GitHub Page, click on Data > Shapefiles, and click on each zip file to find the download button for the dataset. Once downloaded and extracted, these files should be available to your project.
On the Map tab, click the Add Data button and navigate to your project folder containing your data and add the census_tracts and cmcss_school shapefiles to your view. Your screen should look similar to this:
If you have questions about this step you can always refer back to Exercise 2, Step One for more detail.
Question No. 1: How many elementary schools are in Montgomery County and how are they identified differently from middle and high schools in the attribute table?
As with the previous exercise, you will begin by launching QGIS from the start menu, launchpad, or desktop shortcut. On the initial splash screen select New Empty Project from Templates in the middle section of the screen.
In order to complete this exercise you will need to download the Census Tract and CMCSS Schools zip files and extract them in your exercise folder. You can use those links or navigate to the Exercise 3 GitHub Page, click on Data > Shapefiles, and click on each zip file to find the download button for the dataset. Once downloaded and extracted, these files should be available to your exercise.
In the Browser window on the left, navigate to the folder where you saved and unzipped the exercise data folder. Once you locate the folder in the browser, you can select the montco_tracts.shp and cmcss_schools files and click-and-drag them to the map canvas. Your screen should look similar to this:
If you have questions about this step you can always refer back to Exercise 2, Step One for more detail.
Question No. 1: Using the attribute table for cmcss_schools, answer the question below: How many elementary schools are in Montgomery County and how are they identified differently from middle and high schools in the attribute table?
Recall from the previous laboratory that you will begin by opening the appropriate Colab notebook for each exercise. In order to maintain consistency across labs, each repository will contain a new notebook with a constant nomenclature:
Each notebook can be found in the corresponding GitHub repository, and following the directions in the introduction for Exercise 2, can be opened by inserting tocolab immediately after github in the web address. So the URL for this exercise would read:
https://githubtocolab.com/chrismgentry/GIS1-Exercise-3/blob/main/GIS1_EX3.ipynb
You can copy/paste the link above or navigate to the file in the Exercise 3 Repository, click on the link for GIS1_EX3.ipynb, and add tocolab to the URL. In future labs you can simply change the repository and notebook numbers to match the exercise number to directly open each. Remember to save a copy of the notebook in your Google Drive. If you have any questions or need reminders of how to work in the Google Colaboratory Environment click here or refer back to the information provided in Exercise 2 Notebook.
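The URL pattern described above can also be sketched in R. This is only an illustration: the helper name colab_url is hypothetical, and it assumes every exercise follows the same repository and notebook naming as Exercise 3.

```r
# Build the Colab address for a given exercise number.
# Assumption: each exercise uses the same repository/notebook naming pattern.
colab_url <- function(exercise) {
  sprintf(
    "https://githubtocolab.com/chrismgentry/GIS1-Exercise-%d/blob/main/GIS1_EX%d.ipynb",
    exercise, exercise
  )
}

# Converting an existing GitHub address works the same way:
# insert "tocolab" immediately after "github".
github_url <- "https://github.com/chrismgentry/GIS1-Exercise-3/blob/main/GIS1_EX3.ipynb"
converted <- sub("github.com", "githubtocolab.com", github_url, fixed = TRUE)

colab_url(3)
converted
```

Both lines at the end should print the same address as the link given for this exercise.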
Similar to Exercise 2, you will use the tidyverse, maps, and ggsn packages to complete the exercise. In addition to these packages, you will also need the tigris package. This will provide direct access to Topologically Integrated Geographic Encoding and Referencing (TIGER) data from the U.S. Census Bureau within R. Information on the package can be found at the following link.
#install.packages('tidyverse')
#install.packages('maps')
#install.packages('ggsn')
#install.packages('tigris')
library('tidyverse')
library('maps')
library('ggsn')
library('tigris')
With the appropriate packages installed and loaded to your environment, you can now use the tigris package to obtain census tract data for Montgomery County. For this you will use the tracts() function and first identify the state (by abbreviation) and the county name. Remember to use the <- operator to make an object from the data.
montco_tracts <- tracts("TN", county = "Montgomery")
Examining the dataset you will see information regarding the state and county codes as well as other information regarding the polygons' geometry. There is also a variable for NAME that details the census tract identification you will use in an upcoming step to add population information. Because this dataset is created using a specific object type called simple features, you will need to add the polygons to ggplot2 using the function geom_sf() instead of the geom_polygon() function we used in the previous exercise. To create a simple visualization of the data you will use:
ggplot(data = montco_tracts) + geom_sf()
If you were interested in obtaining information for the census tracts around Nashville, how would you alter the code above?
In this step you will organize and display the data in order to prepare it for the final visualization.
Before you can analyze the relationship between student population and the proposed new school, you need to create a marker for where the school will be located. To do this you will click the Go To XY button on the Map tab. This will open a new menu at the bottom of the screen where you can input longitude and latitude values.
In the new menu, type the longitude and latitude values in the boxes provided and click the Mark Location button to add a marker. The longitude and latitude values are:
Longitude = -87.1950 Latitude = 36.5680
If you right-click on the new Graphics Layer and select Properties, you can go to the General tab and change the name to “New School”. To change the style or look of the marker, be sure to have the New School element selected in the Table of Contents. Then, on the new Graphics tab, use the Select tool to select the marker and use the options within the Symbology portion of the tab to alter the symbol, fill color, outline stroke, etc.
Finally, similar to the steps in Exercise 2, Step 2, you need to use graduated colors to symbolize the various tracts by the pop_5_18yo population data. To do that you will use the Symbology button to gain access to the graduated colors menu. Be sure to have montco_tracts selected in the Table of Contents and click on the Appearance tab in the Feature Layer menu. Click on the option for “Graduated Colors” from the drop-down menu.
In the resulting Symbology window, select the following options:
You can now close the symbology window and view the results.
By using the attribute table, which tracts (NAME column) have the largest number of males and females in the dataset?
Before you can analyze the relationship between student population and the proposed new school, you need to create a marker for where the school will be located. To do this you will go to Layer > Create Layer > New Temporary Scratch Layer on the menu bar. This will open a new menu where you will create a temporary point named New School.
With the new scratch layer added, be sure the layer is selected in the Layers section, and turn editing on by using the Toggle Editing button. With editing on, use the Add Point Feature button and click on the map to create a point. Because you are going to edit the longitude and latitude, it does not matter where you place the point. With the point added, click the Vertex Tool button and double-click on the point you made.
This will open the vertex editor options and allow you to edit the longitude and latitude coordinates of the point.
Longitude = -87.1950 Latitude = 36.5680
In the editor change the values to the longitude and latitude listed above. This will adjust the point and place it in the appropriate location.
With the point moved, you can now use the symbology properties to change the new school point to make it unique from the current schools.
Finally, similar to the steps in Exercise 2, Step 2, you need to use graduated colors to symbolize the various tracts by the pop_5_18yo population data. In the right-click/control-click menu for montco_tracts, click Properties and navigate to the Symbology tab. In the resulting Symbology window, select the following options:
You can now close the symbology window and view the results.
By using the attribute table, which tracts (NAME column) have the largest number of males and females in the dataset?
Now that you have polygons for the census tracts in Montgomery County, you will need to merge the population information. Similar to Exercise 1, Step 1, you can use the read.csv() function to obtain the population dataset. However, there is one alteration to the code you will make for this exercise. One of the variables in this dataset, titled “Tract”, needs to match that of “NAME” in the census data. In the census dataset it is set to character (meaning text) and in the population data it is set to double (a type of numeric data). In order to make them match, you can add colClasses=c(Tract="character") to the read.csv() function to change the variable on import. The script would look like this:
population <- read.csv('https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-3/main/Data/mont_co_pop.csv', colClasses=c(Tract="character"))
head(population)
Notice in the table generated with the head() function the variable now has <chr> listed as its class. This is a useful addition for future scripts as you can generally identify any variable and change its class prior to importing the data.
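To see what colClasses changes, the following minimal sketch imports the same text twice. The CSV content here is made up for illustration and only stands in for the real population file.

```r
# Toy CSV text standing in for the population file (values are made up).
csv_text <- "Tract,total_pop\n1001,4200\n1013.10,3100"

# Default import: R guesses the class, so Tract becomes numeric.
default_import <- read.csv(text = csv_text)

# Import with colClasses: Tract is kept as text; other columns are still guessed.
typed_import <- read.csv(text = csv_text, colClasses = c(Tract = "character"))

class(default_import$Tract)
class(typed_import$Tract)
typed_import$Tract
```

With the default import, a value such as 1013.10 is stored as a number and loses its trailing zero, so it would no longer match a text key written as "1013.10". Reading the column as character keeps each value exactly as it appears in the file.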
Using the same process as in Exercise 2, Step 2, you will use the merge() function to combine the montco_tracts and population datasets. When merging the datasets you need to identify which datasets (x and y) and which common variable (by.x and by.y) will be used for the merge.
tract_pop <- merge(x = montco_tracts, y = population, by.x = "NAME", by.y = "Tract", all = TRUE)
Using the str() or head() function you can view the structure of the new dataset. Familiarize yourself with the different variables for future analyses. More information on the merge() function can be found by Googling merge() function r. You can now use ggplot to visualize the data based on a variable such as total_pop (total population).
ggplot() + geom_sf(data = tract_pop, aes(fill = total_pop))
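Because the merge uses all = TRUE, any tract that appears in only one of the two datasets is kept and padded with NA. A minimal sketch of how to spot such rows is shown below, using two small made-up data frames in place of montco_tracts and population (the names ending in _demo and the county column are hypothetical).

```r
# Made-up stand-ins for the tract and population tables.
tracts_demo <- data.frame(NAME = c("1001", "1002", "1003"),
                          county = "Montgomery")
pop_demo <- data.frame(Tract = c("1001", "1002", "1004"),
                       total_pop = c(4200, 3100, 2800))

# Same call pattern as in the exercise: keep every row from both sides.
merged_demo <- merge(x = tracts_demo, y = pop_demo,
                     by.x = "NAME", by.y = "Tract", all = TRUE)

# Rows with an NA on either side had no partner in the other table.
unmatched <- merged_demo[is.na(merged_demo$total_pop) | is.na(merged_demo$county), ]

merged_demo
unmatched
```

If a check like this returns rows for the real data, the key columns probably differ in class or formatting, which is the situation the colClasses option is meant to prevent.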
Before you can analyze whether or not the new school is being built in the appropriate location, you need to add the information for the current school as well as the proposed school location.
Like adding the population data earlier in this section, use the <- operator to create an object and the read.csv() function with the following link to connect to the current school dataset:
https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-3/main/Data/county_schools.csv
Unlike the previous step, there is no need to alter the dataset using colClasses. Use head() to view the variables. Finally, you need to create a new data point for the new school. For this you will create a data frame that includes variables for longitude, latitude, and name.
A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.
To create the data frame you will use the following script:
new_school <- data.frame(x = -87.19495519624655, y = 36.56793328339189, name = "New School")
In this script data.frame() is the function, x is the longitude, y is the latitude, and name is the variable for the name. Use head() to view the data frame.
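As a small illustration of the row-and-column structure of a data frame, the sketch below builds a one-row example and inspects it. The object name new_school_demo is hypothetical, and the coordinates are the rounded values listed for the proposed school in this exercise.

```r
# One row (one school), three columns (longitude, latitude, name).
new_school_demo <- data.frame(x = -87.1950, y = 36.5680, name = "New School")

str(new_school_demo)    # class of each column
dim(new_school_demo)    # number of rows, then number of columns
names(new_school_demo)  # column names used later inside aes()
```

The column names reported by names() are the ones you would refer to when plotting the point.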
summary(tract_pop$total_pop) can be used to summarize the total population variable.
Using the example above, replace the variable and answer the following question: What is the maximum number of males and females in the dataset?
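For reference, here is a minimal sketch of how summary() output can be read and how a single statistic can be pulled from it. The numbers are made up and do not come from the exercise data.

```r
# Made-up counts standing in for one column of tract_pop.
children_demo <- c(1200, 3400, 2800, 5100)

stats <- summary(children_demo)
stats              # minimum, quartiles, mean, and maximum
stats[["Max."]]    # extract only the maximum from the summary
max(children_demo) # the same value computed directly

# If a column contains NA (possible after a merge with all = TRUE),
# max() needs na.rm = TRUE to ignore the missing values.
max(c(children_demo, NA), na.rm = TRUE)
```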
You will now create a graphical display of your data that includes cartographic elements such as legend, scale bar, north arrow, etc.
Following the same directions in Exercise 2, Step 3 you will now create a map of this data. Remember, to complete your visualization you will need to add: the data, title, legend, north arrow, scale bar, and your name and date. Because all of the information for adding these elements is detailed in the previous exercise, this exercise will not walk through adding them step-by-step.
From the Insert Tab click the New Layout button and select ANSI - Landscape, Letter 8.5" x 11" to set up the page.
As you are working to create your final visualization, to edit, move or adjust each individual element of your map you must have it selected in the table of contents. When adding text elements, it might help to rename each of the text elements to be able to correctly identify them.
You should use your own style and creative ideas to personalize your map. The images used in the exercise are examples of how different elements can be displayed.
Question No. 3: What is the relationship between the location of the new school and the population of children within the census tracts?
Following the same directions in Exercise 2, Step 3 you will now create a map of this data. Remember, to complete your visualization you will need to add: the data, title, legend, north arrow, scale bar, and your name and date. Because all of the information for adding these elements is detailed in the previous exercise, this exercise will not walk through adding them step-by-step.
From the menu bar click Project > New Print Layout.
As you are working to create your final visualization, remember to right-click/control-click on each item to access the properties that will allow you to customize the items. Refer back to the previous exercise (linked above) for more information on each item.
You should use your own style and creative ideas to personalize your map. The images used in the exercise are examples of how different elements can be displayed.
Question No. 3: What is the relationship between the location of the new school and the population of children within the census tracts?
With these datasets you are now ready to create a map to examine the distribution of school-aged children across the county in relation to the current schools and future school. In Steps 1 and 2 of this exercise and Exercise 2, Step 3 you used ggplot2 to visualize the data. However, because of the difference in data types, in this exercise you used the geom_sf() function instead of geom_polygon() like in the last exercise. Building on the previous script, you will now use geom_point() to add the schools information. Remembering back to the introduction, the goal of this analysis is to help determine if the location for the new school is appropriate based on the needs of the county. This means the fill portion of the aesthetics will need to represent the correct variable in the dataset.
Please refer back to Exercise 2 if you need a refresher on some of the basic elements of mapping data in ggplot2. The construction of this map will be very similar to the previous final map. You will use a combination of ggplot2 to plot the information and ggsn to add the scale bar and north arrow. In this exercise you are going to add some additional theme information to further customize your map.
To begin you will use a similar script from Exercise 2, Step 3, substituting the necessary objects and variables and swapping geom_polygon for geom_sf:
ggplot() +
geom_sf(data = tract_pop, aes(fill = age5_18yo), color = "white") +
scale_fill_viridis_c(option = "D")
In this example, you set the data object to tract_pop, which was created by merging the census tract and population datasets. In the aesthetics (aes()), the fill is set to the variable age5_18yo (a variable indicative of school-aged children). When used within aes(), fill colors each tract according to a variable in the dataset; in this case, the 5-18yo age bracket. In Exercise 2 you used the scale_fill_viridis_c() function to set the color ramp of the data. Refer back to Exercise 2 or Google scale_fill_viridis_c() for more information on the function and its color options.
With the census tracts added, you need to add the CMCSS schools and the new proposed school. This will be accomplished by using the geom_point() function and identifying the appropriate objects and aesthetics.
ggplot() +
geom_sf(data = tract_pop, aes(fill = age5_18yo), color = "white") +
scale_fill_viridis_c(option = "D") +
geom_point(data = schools, aes(x=X, y=Y), size = 3, shape = 21,
fill = "black", color = "white") +
geom_point(data = new_school, aes(x=x, y=y),
shape = 8, size = 5, color = "red")
When you examined the structure of schools and new_school you should have noticed x and y variables in each dataset (X and Y in schools, x and y in new_school). This information is used to plot the location of the points in the aes() portion of each geom_point() function. Only the information used to identify the axis location (x,y) or modify the geometry (polygon, point, etc.) based on a variable in the dataset should be included within aes(). Any other modification of the geometry not connected to the dataset should be outside of aes(). So for this example size, shape, color, and fill are added to customize the points.
color (outside of aes()) alters the color of the geometry or border. Almost any hexadecimal value or color name can be used in R. Regardless of the method used, the values must be enclosed within quotation marks. A list of color options can be found here.
fill (outside of aes()), like color, modifies the appearance of the geometry. When used with a geometry that has a border, fill refers to the geometry and color refers to the border. When attempting to modify a solid object, use color to customize the geometry.
You should feel free to experiment with different size, shape, fill, and color aesthetics to customize your own maps.
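The difference between mapping an aesthetic inside aes() and setting it outside can be seen in a small sketch like the one below. The data frame and its column names (lon, lat, enrollment) are made up for illustration and are not part of the exercise data.

```r
library(ggplot2)

# Made-up points; the column names are hypothetical.
demo_points <- data.frame(
  lon = c(-87.40, -87.30, -87.20),
  lat = c(36.50, 36.55, 36.60),
  enrollment = c(450, 800, 1200)
)

p <- ggplot(demo_points, aes(x = lon, y = lat)) +
  # fill is inside aes(): it varies with the enrollment column.
  # shape, size, and color are outside aes(): one fixed value for every point.
  geom_point(aes(fill = enrollment), shape = 21, size = 4, color = "white")

p
```

Shape 21 is a circle with a border, so fill controls the interior and color controls the outline. Moving color = "white" inside aes() would instead treat "white" as a data value and add it to the legend, which is usually not what is wanted.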
Next you need to add the scalebar(), north() arrow, and name/date of the creator (using annotate()) to the map. Remember that the location of these elements is defined by the gridded location on the map. Because this map is centered on Montgomery County, the map extends from about -87.60° to -87.10° longitude and from about 36.30° to 36.65° latitude. So all of the anchor = c() values should fall within, or very close to, those ranges.
ggplot() +
geom_sf(data = tract_pop, aes(fill = age5_18yo), color = "white") +
scale_fill_viridis_c(option = "D") +
geom_point(data = schools, aes(x=X, y=Y), size = 3, shape = 21,
fill = "black", color = "white") +
geom_point(data = new_school, aes(x=x, y=y), shape = 8, size = 5,
color = "red") +
north(tract_pop, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x=-87.15, y=36.31)) +
scalebar(tract_pop, dist = 5, dist_unit = "mi", transform = TRUE, model = "WGS84", location = "bottomleft",
st.dist = 0.02, st.size = 2, anchor = c(x= -87.65, y= 36.31)) +
annotate("text", x = -87.22, y = 36.35, label = "Chris Gentry\n August 2021", size = 2)
Additional details about the north arrow and scale bar can be found in Exercise 2 or by Googling ggsn package r. Having the longitude and latitude values currently displayed on the map should help you adjust the position of these elements. However, in the final map these will not be necessary, so adding theme_void() in the next step will remove the grid, gray background, and displayed x,y axes.
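When adjusting anchor positions, a quick check against the approximate county extent can catch a mistyped coordinate before it pushes an element off the map. The sketch below assumes the rounded longitude and latitude ranges quoted in this exercise; the helper name anchor_in_range is hypothetical.

```r
# Approximate extent of the map (rounded values; an assumption, not exact bounds).
lon_range <- c(-87.60, -87.10)
lat_range <- c(36.30, 36.65)

# Returns TRUE when a named c(x = , y = ) anchor lies inside both ranges.
anchor_in_range <- function(anchor) {
  anchor[["x"]] >= lon_range[1] && anchor[["x"]] <= lon_range[2] &&
    anchor[["y"]] >= lat_range[1] && anchor[["y"]] <= lat_range[2]
}

anchor_in_range(c(x = -87.15, y = 36.31))  # north arrow anchor from the script
anchor_in_range(c(x = -87.65, y = 36.31))  # scale bar anchor from the script
```

The first call returns TRUE and the second returns FALSE, because the scale bar anchor sits slightly west of the rounded range. That is a reminder that the quoted ranges are approximate; calling sf::st_bbox(tract_pop) on the actual data should give the exact extent.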
To complete the map, you need to add a title, adjust the name of the legend to something more appropriate for the data, and add a label for the new school. Similar to the previous exercise you can use labs() to add title= information. To add the label to the new school you will use the geom_text() function. In order to customize the position of the label, nudge_y= and nudge_x= values can be added. The data and aes() scripts will match the information from the geom_point() with the addition of a label option to the aes() and other customizations. To alter the legend title, a text option needs to be added to the scale_fill_viridis_c() function.
ggplot() +
geom_sf(data = tract_pop, aes(fill = age5_18yo), color = "white") +
scale_fill_viridis_c(option = "D", name = "No. of Children") +
geom_point(data = schools, aes(x=X, y=Y), size = 3, shape = 21,
fill = "black", color = "white") +
geom_point(data = new_school, aes(x=x, y=y), shape = 8, size = 5,
color = "red") +
north(tract_pop, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x=-87.15, y=36.31)) +
scalebar(tract_pop, dist = 5, dist_unit = "mi", transform = TRUE,
model = "WGS84", location = "bottomleft",
st.dist = 0.02, st.size = 2, anchor = c(x= -87.65, y= 36.31)) +
annotate("text", x = -87.22, y = 36.35, label = "Chris Gentry\n August 2021", size = 2) +
labs(title = "School Aged Children per Census Tract and CMCSS Locations") +
geom_text(data = new_school, aes(x=x,y=y, label = name), nudge_y = 0.025, nudge_x = 0.0, color = "white") +
theme_void()
What is the relationship between the location of the new school and the population of children within the census tracts?
After presenting this information at the advocacy group meeting, one of the committee members asks how this information will change over the next 5 years. Fortunately, you already have the information you need to determine this. In the population dataset there is a variable for children that are in the 0-5 year old age range. This is the “future” student population. Edit your map to show the census tracts based on this variable instead of the current student population (age5_18yo). Software-specific directions can be found for each step below. Please submit your answers to the questions as well as your final map by the due date.
In order to answer the question above, you need to examine the attribute table and determine which variable contains the information for children aged 0-5 years old. For your final map, you should be able to keep the majority of your map the same and just alter the field for the graduated color. Be sure to determine what the appropriate interval size should be as the values for children 5-18yo will be different than the number of children between 0-5yo.
Question No. 4: How did census tract populations change with the move from 5-18 year old to 0-5 year old?
In order to answer the question above, you need to examine the attribute table and determine which variable contains the information for children aged 0-5 years old. For your final map, you should be able to keep the majority of your map the same and just alter the field for the graduated color.
Question No. 4: How did census tract populations change with the move from 5-18 year old to 0-5 year old?
In order to answer the question you simply need to adjust the fill variable. Remember you can use head() or str() to view the variables in the dataset. Using the script from the previous step, you should change the following:
geom_sf(data = tract_pop, aes(fill =))
to the variable for 0-5 year old children.
How did census tract populations change with the move from 5-18 year old to 0-5 year old?
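If the list of variables is long, one way to narrow it down is to search the column names for a pattern. The sketch below uses a made-up data frame; age_example is a hypothetical column name and not the actual variable in the population data.

```r
# Made-up table; only age5_18yo matches a name used in this exercise.
tract_demo <- data.frame(
  NAME = c("1001", "1002"),
  total_pop = c(4200, 3100),
  age5_18yo = c(900, 650),
  age_example = c(300, 210)  # hypothetical column name
)

names(tract_demo)                               # every column name
grep("^age", names(tract_demo), value = TRUE)   # only names starting with "age"
```

The same two calls can be run on tract_pop to list its variables, assuming the age-related columns share a common prefix.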
In the report you provide to the advocacy group please provide the following information:
When complete, send a link to your Colab Notebook or Word document with answers to Questions 1-4 and your completed map via email.