Welcome to Part 3 of Mango's video tutorial series on sourcing and preparing map ready data.
The series will help you to source third party geospatial data for your mapping projects, clean and prepare them for upload to an online GIS mapping platform, and visualizing your data on interactive web maps.
This video will show you how to clean up third party map data to zero in on your areas of interest.
If this is your first time viewing this series, we recommend you start with Video 1: Why Maps are Superior to Spreadsheets.
To complete these tutorials, you'll need to download QGIS, and sign up for a free 30-day trial of Mango.
Watch the next video:
How to Stack Layers on Your Map
Watch other videos in this series:
- Video 1: Why Maps are Superior to Spreadsheets
- Video 2: How to Find Data for Your Maps
- Video 3: How to Clean Up Your Map Data [WATCHING]
- Video 4: How to Stack Layers on Your Map
- Video 5: How to Map a Spreadsheet Containing Regions or Areas
- Video 6: How to Map Addresses From an Excel Spreadsheet
Exclusive freebie! The Beginners Guide to GIS eBook. 108 pages, completely free.
Grab your copy today.
VIDEO TRANSCRIPT
Hello and welcome back!
In the previous video in the series we showed you how to join map data to your spreadsheet data using QGIS, and then how to upload that Shapefile into Mango go in order to make a quantity map.
In this video we're going to show you how to clean up the map data. The map data that we uploaded to Mango was a little bit messy; it had lots of counties that we weren't interested in, and the data also contained lots of columns from the original map data that aren’t of interest to us.
So in this video I'm going to show you how you can clean that data up before you go ahead and upload it to Mango.
The first thing that we're going to do is we're going to remove any unwanted columns from the data we do this by right-clicking on sales data, pressing open attribute table, so we now have the attribute table open - now before we can do any editing in QGIS we need to press the edit button, so now the data set becomes editable, and then we need to click on this delete column button that's in the toolbar at the top. Now in here we can select the columns that we would like to remove.
Now if we want the data to match our spreadsheet, all we had in our spreadsheet if you remember, was name, ID, sales, population, and income. So these are the only columns that we really need in our map data also.
So we're going to remove STATEFP, COUNTYFP, and all of the items that aren't necessary for our purposes. So if I press OK, we'll see that those columns have now been removed from the dataset, and now if I press the edit button again to leave editing mode is going to prompt me to save the changes I've made.
So I press save and now the data set has been permanently modified.
So back in QGIS, we've now removed the columns that we don't need, the next step is to remove all of the counties that aren't in Texas.
To do that we once again open up the attribute table, and once the attributes table is open - as before we need to put it into editing mode - now we've now removed the column that says the name of the state that the county is contained in, so how we going to remove the features that we're not interested in; the features that out that are outside of Texas?
Now one attribute that they all have in common is we have sales data for those counties, and all of the counties that are outside of Texas contain null values, so if we scroll down the map here are the null values. So what we want to do is remove those null values. Now to remove those features we need to press this button here, select features using an expression.
So what we need to do, the expression will allow us to perform a query to select all of the features that we're interested in moving, and we have this helpful toolbar here that says all of the options that are available. So what we want to do is we go to fields and values, we want to go to sales, and we want to select all of the features where sales is null.
Now when you perform the query you'll see this output preview down here: 1 means true; 0 means false, and you'll see an error here if your query has been incorrectly typed.
Now all of the options that are available within this expression tool are too complex to cover with in this video tutorial, but QGIS has comprehensive documentation, so if you want to see how you can perform specific queries to select features for your specific data set, then please refer to the QGIS documentation.
So now we're going to press select, and we're going to press close. Now if we scroll down we can see that all of the counties in Texas are not selected and all of the counties outside Texas are selected, and the selection has is also reflected on the map.
Okay so once we've made that selection, we now want to press this button here which is the delete selected features button.
Once that's done, all of the counties that didn't meet our criteria has now been deleted, and as you can see on the map we now only have the counties for Texas. To save that change once again we exit editing mode, we press save, and it's done.
We now have a clean dataset. We have a data set where the attribute table only contains the columns that we're interested in, and it only contains the counties that we're interested in.
So now we have our nice shiny and clean data, let's go back into Mango, and let's remove that data that we had before that contained all of the features we're not interested in, all of the columns we're not interested in, and upload that new clean dataset
So we go to the layers we're going to press the delete button on the sales data, and we're going to press the add layer button again, and we're going to upload our new clean sales data.
Once again, we want to do a quantity legend, as we can see in the drop down we only have sales population and income now. We're going to choose sales. Let's change the background the outline color to a dark gray like we did last time. Press save.
And there we have it: the sales data only for the counties in Texas. Also if we click on one of the individual features to see the attribute table, we can see that the attribute table only contains the data that's that we're interested in: it has the ID, the name the number of sales, the population, and the income.