The Gender Pay Gap in Data Science Salaries
The gender pay gap is a contentious issue, especially in tech, where women have historically been excluded. We can explore the gap in Data Science salaries a little with the same Insight data I used last time to look at Data Science salaries in general.
Others have looked into the same question before: Florian Lindstaedt used a much larger (but less clean) dataset from Kaggle to look at the issue on his blog. He found that for data scientists younger than 30, women earned slightly more, but in the 30-35 age group men earned more.
My data is much smaller, but better curated. However, it has some biases in that it is collected from Insight alumni who are mostly:
- Early career
- In high-demand markets
- Coached on salary negotiation
The question about the respondent’s gender was added to the survey late, so around a third of the data does not have that information. This leaves us with 79 men and 28 women. Not a huge sample, but better than nothing.
Of course, this low number of women might itself be a further bias: Insight generally has pretty gender-balanced cohorts, so the fact that many fewer women have filled out the survey is worrying. It is possible that non-response is correlated with the underlying distribution; for example, perhaps people who are paid less decline to report.
The data used in this post is available here. The notebook with all the code is here (rendered on Github).
Pay: Men Vs. Women
Here is total recurring compensation[1] by gender. I have removed all non-data scientists (like the MLEs I looked at last time) because there are very few responses from them. I have also removed the one data scientist who responded “transgender” without further indicating their gender identity.
So, how is pay equality in data science?
Pretty equal, actually! The median woman in the sample earns more than the median man, but of course the number of samples is really small.
|Gender||Median Total Compensation|
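With so few women in the sample, it helps to see how uncertain a difference in medians is. The following is a minimal sketch, not the code from the notebook: the column names `gender` and `total_comp` are assumptions, the rows are made-up placeholder values rather than survey data, and the percentile bootstrap is just one simple way to put an interval around the difference.

```python
import numpy as np
import pandas as pd

# Made-up placeholder rows, NOT the Insight survey data.
# The column names "gender" and "total_comp" are assumptions.
df = pd.DataFrame({
    "gender": ["Male", "Male", "Male", "Male", "Female", "Female", "Female"],
    "total_comp": [140_000, 150_000, 160_000, 170_000,
                   145_000, 155_000, 165_000],
})

# Median total compensation and sample size per gender.
summary = df.groupby("gender")["total_comp"].agg(["median", "count"])
print(summary)


def bootstrap_median_diff(a, b, n_boot=5_000, seed=0):
    """Percentile bootstrap interval for median(a) - median(b)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a)
    b = np.asarray(b)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choice(a, size=a.size, replace=True)
        resampled_b = rng.choice(b, size=b.size, replace=True)
        diffs[i] = np.median(resampled_a) - np.median(resampled_b)
    return np.percentile(diffs, [2.5, 97.5])


women = df.loc[df["gender"] == "Female", "total_comp"]
men = df.loc[df["gender"] == "Male", "total_comp"]
low, high = bootstrap_median_diff(women, men)
print(f"95% interval for median(women) - median(men): [{low:,.0f}, {high:,.0f}]")
```

If the interval comfortably straddles zero, the sample cannot distinguish a small gap in either direction from no gap at all.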
There are lots of things I would like to explore, such as whether women see the same benefit from seniority that I observed last time, but I just do not have enough women in the sample to say anything conclusive.
Instead I will look at salaries by region (which I know drives large pay differences) and age, which Florian looked at.
Only California (LA, San Francisco, and Silicon Valley) and the Northeast (New York, Boston, and DC) have enough respondents to form any reasonable conclusions, so I limit my sample to those regions.
Again, these look pretty equal, with the median woman earning slightly more than the median man in both regions.
|Region||Gender||Median Total Compensation|
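The selection described above (data scientists only, restricted to the two regions, then split by gender) could be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: the column names `title`, `city`, `gender`, and `total_comp` and the exact city labels are hypothetical, and the rows are made-up placeholders.

```python
import pandas as pd

# City-to-region mapping taken from the regions named in the post.
# The exact city spellings used in the survey are an assumption.
REGIONS = {
    "Los Angeles": "California",
    "San Francisco": "California",
    "Silicon Valley": "California",
    "New York": "Northeast",
    "Boston": "Northeast",
    "DC": "Northeast",
}

# Made-up placeholder rows, NOT the Insight survey data.
df = pd.DataFrame({
    "title": ["Data Scientist"] * 6 + ["Machine Learning Engineer"],
    "city": ["San Francisco", "Los Angeles", "New York", "Boston",
             "Seattle", "DC", "San Francisco"],
    "gender": ["Female", "Male", "Female", "Male", "Male", "Male", "Female"],
    "total_comp": [160_000, 150_000, 140_000, 130_000,
                   145_000, 150_000, 170_000],
})

# Keep only data scientists, then keep only cities in the two regions.
ds = df[df["title"] == "Data Scientist"].copy()
ds["region"] = ds["city"].map(REGIONS)  # cities outside the mapping become NaN
ds = ds.dropna(subset=["region"])

# Median compensation and sample size for each region/gender cell.
by_region = ds.groupby(["region", "gender"])["total_comp"].agg(["median", "count"])
print(by_region)
```

Reporting the count next to each median makes it obvious when a cell rests on only a handful of respondents.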
Finally, I can check what Florian found: that women under 30 earned more than men in the same age range, but men out-earned women in the 30–35 age range. I use the same selection as above, but now partitioning by age instead of region.
I do not see Florian’s trend; instead the salaries look roughly equal, with the median woman earning more in every age group, as shown below:
|Age||Gender||Median Total Compensation|
|0 to 30||Female||$155k|
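Partitioning by age amounts to binning the ages and grouping on the bin. Here is a minimal sketch using `pandas.cut`; the bin edges beyond "0 to 30", the label names, the column names `age`, `gender`, and `total_comp`, and the rows themselves are all assumptions for illustration, not values from the survey.

```python
import pandas as pd

# Made-up placeholder rows, NOT the Insight survey data.
df = pd.DataFrame({
    "age": [26, 28, 29, 31, 33, 34, 38, 41],
    "gender": ["Female", "Male", "Male", "Female",
               "Male", "Female", "Male", "Female"],
    "total_comp": [150_000, 140_000, 160_000, 155_000,
                   165_000, 170_000, 180_000, 175_000],
})

# Assumed bin edges: [0, 30), [30, 35), [35, 100).
# right=False makes each bin include its left edge and exclude its right edge.
bins = [0, 30, 35, 100]
labels = ["0 to 30", "30 to 35", "35 and up"]
df["age_group"] = pd.cut(df["age"], bins=bins, labels=labels, right=False)

# Median compensation and sample size for each age-group/gender cell.
by_age = (
    df.groupby(["age_group", "gender"], observed=True)["total_comp"]
      .agg(["median", "count"])
)
print(by_age)
```

Whether a 30-year-old lands in the younger or older bin depends on the `right` argument, so it is worth stating the convention when comparing against someone else's age groups.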
In my small dataset, women in data science earn about the same as men, and they do so across regions and age groups. I wish I could have explored more slices of the data to look at things like seniority, percent of compensation in stock, etc., but slicing the data very quickly reduces the number of data points to the point where they are no longer useful.
[1] Salary, yearly bonus, and yearly stock grant. Signing bonus is not included.