Showcasing my baseball analytics projects.
View the Project on GitHub jjsvenson/jj-svenson-baseball-analytics
Using R and Trackman CSV data, I built a report that analyzes pitch sequencing.
My goal with the pitch sequencing report was to investigate how each of a pitcher's pitch types performed when thrown immediately after a specific pitch type. For each combination I calculated usage %, strike %, swing %, in-zone swing %, whiff %, in-zone whiff %, chase %, swinging strike %, called strike + whiff (CSW) %, ball in play %, out % on balls in play, ground ball %, line drive %, fly ball %, pop up %, and average exit velocity. To do this, I wrote functions in R that compute these statistics from the variables in a Trackman CSV, which included defining the strike zone in terms of the spatial axes Trackman uses.
My code filters the data so that I can analyze, for example, all pitches a certain pitcher threw after a fastball. The statistics then go into tables sorted by pitch type: the report has one page per pitcher and, on each page, one table per preceding pitch type showing the results of the pitches that followed it. Because the focus is pitch sequencing, I did not count the first pitch of an at bat as following the last pitch of the previous at bat. The data only includes pitches that followed a given pitch within the same at bat.
The report also only includes competitive pitches, meaning pitches within 8 inches of each corner of the zone and between plate level and a foot above the zone. This is an attempt to reduce noise from uncompetitive pitches that are way out of the zone.
For better interpretability, the statistics that I felt were useful to compare against league averages are automatically color coded on a red-to-green scale, where green is good for the pitcher, based on the rolling NCAA average and standard deviation for each statistic. The color scales are different for each pitch type because, for example, league average whiff % for a fastball is lower than league average whiff % for a slider.
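As a rough illustration of the per-pitch-type color scale, the sketch below converts a statistic to a z-score against a league baseline for that pitch type and maps it onto a red-to-green ramp. The function name `stat_color`, the `league` table and its numbers, and the two-standard-deviation cap are all assumptions made for this example. They are not the report's actual code or real NCAA averages, and the baselines here are fixed rather than rolling.

```r
# Hypothetical league baselines per pitch type (made-up numbers, NOT real NCAA data).
league <- data.frame(
  pitch_type = c("Fastball", "Slider"),
  avg        = c(0.18, 0.33),   # assumed league-average whiff rate
  sd         = c(0.05, 0.07),   # assumed standard deviation
  stringsAsFactors = FALSE
)

# Map a statistic to a hex color on a red-white-green ramp using its z-score
# against the baseline for that pitch type. Z-scores are clamped to +/- `cap`
# standard deviations. Use higher_is_better = FALSE for statistics where a
# lower value is good for the pitcher (for example, average exit velocity).
stat_color <- function(value, pitch_type, baselines,
                       higher_is_better = TRUE, cap = 2) {
  row <- baselines[match(pitch_type, baselines$pitch_type), ]
  z <- (value - row$avg) / row$sd
  if (!higher_is_better) z <- -z
  z <- pmax(pmin(z, cap), -cap)       # clamp extreme values
  pos <- (z + cap) / (2 * cap)        # rescale to [0, 1]
  ramp <- grDevices::colorRamp(c("red", "white", "green"))
  grDevices::rgb(ramp(pos), maxColorValue = 255)
}

# The same 30% whiff rate is judged against a different scale per pitch type.
stat_color(0.30, "Fastball", league)  # well above the assumed fastball baseline
stat_color(0.30, "Slider", league)    # slightly below the assumed slider baseline
```

With these made-up baselines, a 30% whiff rate lands at the green end of the scale for a fastball but on the red side of neutral for a slider, which is the reason for keeping a separate scale per pitch type.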
Each pitch type is therefore color coded on its own scale. This report includes all of the data from the 2025 Iowa baseball season, but the code can be used to analyze any Trackman data.
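To make the sequencing logic concrete, here is a minimal base R sketch of its three steps: tagging each pitch with the previous pitch type within the same at bat, keeping only competitive pitches, and summarizing what followed a given pitch type. Everything in it is an assumption for illustration rather than the report's actual code: the column names follow common Trackman export conventions (`TaggedPitchType`, `PitchCall`, `PitchofPA`, `PAofInning`, `PlateLocSide`, `PlateLocHeight`, with locations in feet), the strike zone dimensions are placeholder values, the competitive window is one reading of the rule described above, and whiff % is taken as whiffs per swing.

```r
# Toy data with Trackman-style column names (assumed; adjust to your export).
pitches <- data.frame(
  GameID = "G1", Inning = 1,
  PAofInning = c(1, 1, 1, 2, 2, 2),
  PitchofPA  = c(1, 2, 3, 1, 2, 3),
  TaggedPitchType = c("Fastball", "Slider", "Fastball",
                      "Fastball", "Slider", "Slider"),
  PitchCall = c("StrikeCalled", "StrikeSwinging", "InPlay",
                "BallCalled", "FoulBall", "StrikeSwinging"),
  PlateLocSide   = c(0.1, 0.6, -0.2, 2.5, 0.9, -1.2),
  PlateLocHeight = c(2.5, 1.8, 2.9, 5.2, 1.2, 0.4),
  stringsAsFactors = FALSE
)

# Step 1: previous pitch type, reset to NA at the start of every at bat.
# Real data would also need the half inning in `pa_keys`.
add_prev_pitch <- function(df, pa_keys = c("GameID", "Inning", "PAofInning")) {
  ord <- do.call(order, c(unname(as.list(df[pa_keys])), list(df$PitchofPA)))
  df <- df[ord, ]
  pa_id <- do.call(paste, c(df[pa_keys], sep = "|"))
  same_pa <- c(FALSE, pa_id[-1] == pa_id[-length(pa_id)])
  prev <- c(NA, df$TaggedPitchType[-nrow(df)])
  df$PrevPitchType <- ifelse(same_pa, prev, NA)
  df
}

# Step 2: competitive window. Zone half-width and top are assumed values;
# the buffers follow the 8-inch and one-foot rule described above.
is_competitive <- function(side, height, half_width = 0.83, zone_top = 3.5,
                           side_buffer = 8 / 12, top_buffer = 1) {
  abs(side) <= half_width + side_buffer &
    height >= 0 & height <= zone_top + top_buffer
}

# PitchCall values treated as swings (assumed labels, including foul variants).
swing_calls <- c("StrikeSwinging", "InPlay", "FoulBall",
                 "FoulBallNotFieldable", "FoulBallFieldable")

# Step 3: one row per pitch type thrown right after `prev_type`.
sequence_table <- function(df, prev_type) {
  keep <- !is.na(df$PrevPitchType) & df$PrevPitchType == prev_type &
    is_competitive(df$PlateLocSide, df$PlateLocHeight)
  sub <- df[which(keep), ]            # which() drops rows with missing locations
  swing <- sub$PitchCall %in% swing_calls
  whiff <- sub$PitchCall == "StrikeSwinging"
  rows <- lapply(split(seq_len(nrow(sub)), sub$TaggedPitchType), function(i) {
    data.frame(
      pitch_type = sub$TaggedPitchType[i[1]],
      n          = length(i),
      usage_pct  = 100 * length(i) / nrow(sub),
      swing_pct  = 100 * mean(swing[i]),
      whiff_pct  = if (any(swing[i])) 100 * sum(whiff[i]) / sum(swing[i]) else NA_real_
    )
  })
  do.call(rbind, rows)                # NULL if nothing followed `prev_type`
}

seq_df <- add_prev_pitch(pitches)
after_fastball <- sequence_table(seq_df, "Fastball")
after_fastball
```

In the toy data, two sliders follow a fastball within the same at bat, one swinging strike and one foul ball, so the table has a single slider row with a 50% whiff rate. The previous pitch is tagged before the competitive filter is applied, so an uncompetitive pitch still counts as the pitch that came before the next one. That ordering is a design choice in this sketch, and filtering first would give a different notion of "previous pitch". To build one pitcher's page, the data would be subset to that pitcher before calling these functions.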
In this project, I wrote code in R to create a report analyzing pitch sequencing when given Trackman data.
I am always looking to improve my reports and to hear the opinions of others. If you have any questions or comments, feel free to connect with me at svensonjj@gmail.com or view my LinkedIn!