
Finding Community With R-Ladies

Data science is one of the fastest-growing fields in the United States. Touted as a career of the future due to the increased demand for data-driven decision-making across all sectors, data science is dominating the job market, with job openings growing 6.5-fold since 2013. Despite this tremendous growth, gender disparity continues to be an issue, with only 16% of data science roles filled by women. Fortunately, R-Ladies has made it its mission to turn this statistic on its head.

Required skills for data scientists can vary from statistics to machine learning to data visualization, depending on the needs of the company and the role. One skill that is necessary across the board is programming. Over the past decade, R has proven to be a leading programming language supporting inclusivity, due to a movement generated by R-Ladies, a global organization that aims to achieve proportionate representation of minority genders in the R community.

Founded in 2012 by Gabriela de Queiroz in San Francisco, R-Ladies currently sponsors 187 groups in 51 countries on six continents. The New York City chapter is an especially popular hub for seasoned and beginner coders alike, including the biostatisticians in the Department of Population Health Sciences at Weill Cornell Medicine.

Dr. Elizabeth Sweeney, assistant professor of population health sciences in the Division of Biostatistics and R-Ladies NYC board member, first learned about the group two years ago when her previous employer hosted an event. “I saw the presentation and decided I wanted to be a part of it. I immediately fell in love with the community.”

Katherine Hoffman, research biostatistician I, and Elizabeth Mauer, research biostatistician III, found the group online through Meetup and Twitter, respectively.  

Elizabeth Mauer, research biostatistician, presenting at R-Ladies NYC.

One of the main draws of R-Ladies is its monthly talks, which introduce new tools and techniques to R users. Last fall, Dr. Sweeney organized an event at Weill Cornell Medicine with Hoffman and Mauer as the presenters. Mauer’s topic focused on parameterized reports: “In R, I use a platform called R Markdown to automate analysis reports for clinician investigators. Oftentimes clinicians want the same report, but for a subset of their data, or perhaps don’t want to/need to see all of the detailed statistical output. A lot of R users don’t know that you can incorporate parameters into your reports to avoid copy/pasting of source code. With only a few extra lines of code, you can run the same source code under different circumstances.”

Hoffman’s focus was on machine learning: “I went through an overview of the Superlearner machine learning algorithm and showed a demo of a package to implement it. Not everyone who uses R is a statistician, but I wanted to make learning an advanced statistical modeling technique more accessible to a wide range of people.” 

“They were both a big hit,” related Dr. Sweeney. “A wide variety of people attends these events. Some people in academia, some people in industry, and some people who want to switch careers by getting started in R.”

According to R Forwards, 11.45% of package maintainers were women in 2016, an increase from 9% in 2010. In a gender ratio analysis done on GitHub, a software development platform, women represent 9.3% of contributors in R. In contrast, women account for only 2% of contributors in Python, another language commonly used in data science. Conferences have also grown to be more inclusive: nearly half of the scheduled speakers at the 2020 R Conference in New York are women.

While there’s still a lot of work to do, it is clear that R-Ladies has made significant progress, and its success is rooted in the tremendous support received from prominent leaders in the R community. Hadley Wickham, the developer behind popular packages such as ggplot2 and dplyr, has been a champion for diversity, and his voice has been a unifying force for the community. “The R community is just an easy place to ask questions without fear of ridicule. I think Hadley Wickham, along with many other prominent R influencers, have a lot to do with this culture,” shared Mauer. R-Ladies is also a top-level project for the R Consortium, a collaborative group working to support R users, maintainers, and developers.

Community is also a cornerstone of R-Ladies. Meetings allow members to engage with peers beyond the walls of their office and develop professional exposure.

“The biggest thing I got out of it is friendship and networking. There was an R Lady who was scheduled to present at the New York R Conference. She wasn’t able to give her talk, so she gave me her slot,” recounted Dr. Sweeney. “It was an excellent opportunity for me to speak in front of a large number of people and get the message out about my research. We look out for each other.”

“R-Ladies has done a lot for my confidence. I’ve gotten more comfortable with public speaking and talking about statistics and programming. The entire process of learning a topic very well, and then figuring out the level of detail that is right for my audience, is really enjoyable for me,” added Hoffman.

“In the past, I’d always been someone on the sidelines. I’d read Twitter posts to stay up-to-date with R, but I never engaged,” shared Mauer. “Speaking at an R-Ladies event helped me get involved with the community more. Everyone has something to share that may benefit other users.” 

 
