This course is designed for individuals with little or no Alteryx Designer experience who are looking to learn and understand how to use Alteryx. This comprehensive course will cover all of the basic building blocks needed to create data preparation and automation workflows.
We’ll start by teaching the basics: reading, writing, exploring, and cleansing your data. From there, the course covers how to create various filters and calculations.
After you know the basics of building a workflow, it will be time to learn how to combine multiple data streams together with joins, unions, and appends.
Next, you will learn how to pivot your data set, as well as best practices for documenting and organizing your workflow.
We round out the course with some tips for navigating the online Alteryx Community.
Along the way you’ll be guided through the content with a coach. Each week that coach will host two live two-hour training sessions with the entire cohort. If you can’t make it to the sessions you can catch up with the recordings that will be sent to you afterwards.
Your coach will also set aside time for a 30-minute one-on-one meeting each week with you. In this meeting you will have additional opportunities to ask questions and maximize your learning.
Outside of the training sessions and meetings you’ll also have access to our comprehensive lessons. These include step-by-step written instructions, GIF images, and video!
Select a Cohort
Select a start date for your Data Coach experience keeping the following in mind:
Begins the second Monday of every month.
Structured content for 4 weeks.
8-week access to content after cohort completion.
Start Date: April 13, 2020
End Date: May 8, 2020
Limited seats remaining.
PRIMARY COHORT COACH
Software Training Expert
Start Date: May 11, 2020
End Date: June 5, 2020
Limited seats remaining.
PRIMARY COHORT COACH
Software Training Expert
Special Content Preview
For this topic, we’re switching things up a bit for you: until now, we’ve used the Formula tool to create row-by-row calculations. Aggregation calculations differ by performing a calculation over a series of values at once.
Alteryx carries out aggregation calculations with the Summarize tool, which is one of the most versatile tools available. We can use the Summarize tool to find total sums, averages, group by dimensions, and so much more.
In this topic, we will discuss some of the functionality of the Summarize tool before stepping into some examples.
The Summarize Tool
Like we mentioned above, the Summarize tool is extremely powerful. Imagine you have a set of sales data where each state has an associated region. With the Summarize tool, you can group by region to find the total sales from each.
Here is a list of the things the Summarize tool can do:
- Group by dimensions
- Sum values
- Count values
- Count distinct values
- Count null or non-null values
- Find the min or max
- Find the first or last alphabetically
- Perform financial calculations
- Perform numeric calculations, such as average, variance, percentile, and mode
- Perform string calculations
- Perform spatial calculations, such as finding the intersection between polygons
- Perform behavior analysis and reporting functions
Keep in mind that for the above functions to work you must have the correct data. If you don’t have any spatial data, the spatial functions here will not work.
The Summarize Tool in Practice
Finding Group Sums and Averages
For our first example, we will go back to our per order sales data. Let’s briefly revisit it here:
Our goal in this example is to find the total and average sales and profit for each state.
That tells us we will need to group by the state field, sum by sales and profit, and average sales and profit.
Repeat the steps from the image above for profit. Note that you may rename fields in the column that reads “Output Field Name.”
Once you’ve done that, go ahead and run the workflow. Your results should look like this:
Try sorting the results by average sales. What state is leading the pack? If the answer surprises you, try adding a count of sales to the data set. It might seem less surprising once you view the number of sales by state.
In this practice, we will examine how to use the Summarize tool to aggregate spatial data.
Download the Alteryx workflows here:
We will need the following tools for this topic:
If you are unfamiliar with the purpose or configuration of the tools above, please complete the associated topic for that tool.
In this problem, we have provided you with two files containing geographic information.
- A shapefile with records for each state and region in the United States.
- A spreadsheet with records for the 1,000 largest cities in the United States.
Your goal is to find the centroid for each region, the center of mass (based on population — calculated for you) for each region, and then finally to calculate the distance between the two points for each region.
This exercise uses tools that we haven’t introduced you to, so you’re likely to not be familiar with them. We have documented the workflow in such a way that new tools are provided and explained so you can “fill in the blanks.”
Centroids & Centers of Mass
If you’re not familiar with geographic analysis, these two terms may not have much meaning to you. Thankfully, they are simple terms to understand, so we’ll take some time here to explain them.
A centroid is the unweighted center point of a polygon. Picture a circle — the centroid would be the exact center of the circle.
A center of mass is the weighted center point of a polygon. Imagine a bunch of people of various weights standing all over a disk – the balancing point of the circle will no longer be in the exact center, but will be skewed to some side due to uneven weighting.
In the image above, we calculated the centroid (red marker) and center of mass (green marker) for the state of Florida. Notice how the center of mass is farther to the south and east than the centroid which makes perfect sense when you consider the enormous population of southeastern Florida.
When you’ve completed the exercise, your final data set should resemble the following:
Please attempt the practice workflow before moving onto the solution section
View the video or read the solution instructions below.
Start with the practice workflow for this exercise.
After examining each of the input data sources, you’ll notice that the only field they can be joined on is State. Create an inner join so each city record will include the Region and SpatialObj field from the shapefile. Rename the field SpatialObj to State Polygon.
Once the join is complete, the next step is to use a Summarize tool to group by Region and create regional polygons and centroids.
Grouping by Region and then using the Combine function on the State Polygon field results in polygons — a spatial object — that encompass entire regions rather than states. The centroids we created are also grouped by Region, so in this step we’ve taken care of two major calculations. If you were to put a Browse tool after, you would see a map like this:
Moving to the next step, we need to create weighted average latitude and longitude values for each region. Since we haven’t discussed the Weighted Average tool in this course, we don’t expect you to know about it.
The Weighted Average tool is easy to use. Simply identify the numeric field (Latitude and Longitude), what field to weight by (Population), give it a descriptive name, and what field to group by (Region).
Your next step is to join the two weighted average calculations. Since they were grouped by Region, this is the field we will join by.
Because we treated the coordinates as numbers rather than a spatial object, we now have to use the Create Points tool to create a new spatial object.
The Create Points tool creates a spatial object called Centroid. Using a Select tool, rename that field to Center of Mass. You may also deselect the latitude and longitude fields here.
At this point, we have created two data streams: one with regional polygons and centroids and one with regional centers of mass. We now need to use a Join tool to combine the two streams. Set the key field to be Region and deselect the duplicate region field.
We’re almost done – just one more new tool to go. Now that the two streams have been joined, we have a data set with one record for each of the nine regions with polygons and the two point fields. Using the Distance tool we can calculate the distance between two spatial objects.
The Distance tool returns the distance between Region Centroid and Center of Mass for each record in the data set. Now all that’s left is to sort to show the longest distance first and we’ll be all done.
Connect a Browse tool and you’re done. In this exercise, we used several unfamiliar tools and saw a couple of new functions available with the Summarize tool.