Channel Description:

Data Carpentry is a non-profit organization that develops and provides data skills training to researchers.

    In November and December of 2017, we conducted a series of interviews with lesson Maintainers. You can read the summary and see a link to the full report on this blog post. A major theme that emerged from the interviews was that the Maintainers’ jobs were sometimes difficult because it was not always clear what was expected of them or how much authority they had to decide what could be changed in the content of the lessons. To begin to address these issues, we developed an onboarding process for new Maintainers to clarify the role and responsibilities of Maintainers. We also created Curriculum Advisory Committees for the different curricula, to offer high-level guidance on lesson structure and technology choices.

    As our Maintainers have told us, being a Maintainer is a rewarding experience. Maintainers are often the first point of contact for new contributors with our community after going through instructor training. We encourage new Instructors to contribute to our lessons as part of the checkout process. We value the perspective that fresh eyes can bring to improve our lessons and value the opportunity for contributors to experience how powerful collaborative lesson development can be. Maintainers receive pull requests and issues that touch many aspects of our lessons. Some are small and easy to deal with (e.g., fixing a typo), others suggest significant changes to the structure of the lesson or the tools that are used, while others provide comments, suggestions, feedback or ideas for discussion.

    The number and diversity of issues and pull requests that Maintainers receive can sometimes be overwhelming. Both Maintainers and contributors have said that better issue labeling would make it easier to maintain and contribute to lessons. Being able to categorize contributions would help Maintainers to think through the type of issues being reported, and allow them to identify suitable next steps to address them. Issue labels are also useful for facilitating communication among Maintainers.

    Especially for new contributors, issue labels can make contribution easier by signaling what is available to work on, and the type of expertise needed.

    The default set of labels provided by GitHub is not well-suited for our purpose, and Maintainers have expressed the need to have more options to describe the type of issues. For instance, the “bug” label is appropriate for a repository that only contains code, but our repositories contain both code and text, and what qualifies as a bug is not always obvious. We also need to accommodate discussions, which often take place as issues, and other situations that the standard set of GitHub labels doesn’t handle well.

    In the past, we have modified the default set of labels to include the following: “bug”, “discussion”, “enhancement”, “help-wanted”, “instructor-training”, “newcomer-friendly”, “question”, “template-and-tools”, “work-in-progress”. However, while these labels are better suited to our lessons than the default set of labels provided by GitHub, use of these labels has not been standard across all lesson repos, with many repos introducing new labels. This indicates a need for a more robust set of labels to cover different scenarios faced in our lessons.

    Beyond the type of issues, we also want to signal the progress status to both contributors and Maintainers. Contributors should be able to tell from the list of issues the ones that are available to be worked on, and Maintainers should be able to identify issues that are being worked on by contributors. For this reason, we worked with the Maintainer community to propose a new set of issue labels, including two main categories: the type labels, and the status labels. We include these words as prefixes so Maintainers can easily filter on them when assigning them to issues.

    Maintainers have also provided feedback that it is complicated to assess the priority and difficulty of issues. We proposed a single label for each category: one to signal issues that need to be fixed as soon as possible (“high priority”), and one to signal that a first-time contributor can address it (we’ll use “help wanted”).

    Based on the initial feedback from Maintainers, and looking at how other organizations that deal with many contributions are using GitHub labels, we developed a set of issue labels to test, along with definitions for each label. We wrote a proposal, and requested feedback from the community on this document as well as during our monthly Maintainers call. Maintainers from five lesson repositories volunteered to test these labels.

    The feedback from Maintainers of these pilot repositories was positive, and the broader Maintainer community commented extensively on the proposal. Comments revolved around the following issues:

    • Too many labels, especially in the “type” category, leading to too many colors;
    • GitHub has started to highlight on its website issues that have the “help wanted” and “good first issue” labels; not using these labels would mean we would not take advantage of these features;
    • We included a “status:completed” label that was viewed as redundant with simply closing the issue;
    • The distinction between “needs contributor” and “help wanted” was not clear;
    • The definition of some labels needed to be clarified.

    We’re very grateful to the Maintainer community for their thoughtful feedback on this proposal. With the comments that we received, and the testing that was done by Maintainers for the five pilot repositories, we gained valuable information about the usefulness of these issue labels. We are now ready to move into a beta-test. We will test these labels for at least one month, and will solicit feedback from Maintainers and survey how they are being used across our repositories during this time. This information will be used to guide any modifications to the issue labels and ensure that they are maximally useful to our Maintainer and contributor communities.

    More detailed information about the labels and their definitions is available in The Carpentries Handbook.

    The goal for these new issue labels is to provide tools and options to make the Maintainer role easier and help new contributors know where they can be most useful. With these issue labels, when a contributor opens the issue list for a lesson, they’ll know which issues can be addressed or are already being worked on. To this end, we recommend that each issue be labeled with both a “type” and a “status” label, and that these labels be updated as work on the issue progresses.
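As a rough illustration of this recommendation, the sketch below checks whether an issue carries at least one label from each category. It assumes the prefixes are written as “type:” and “status:”; the label names after each prefix, the issue titles, and the helper function are invented for the example and are not the official label set.

```python
def missing_label_categories(labels, prefixes=("type:", "status:")):
    """Return the prefixes for which the issue has no label."""
    return [
        prefix
        for prefix in prefixes
        if not any(label.startswith(prefix) for label in labels)
    ]


# Hypothetical issues: titles and label names are made up for illustration.
issues = {
    "Fix typo in episode 2": ["type:typo", "status:in progress"],
    "Discuss switching plotting library": ["type:discussion"],
    "Broken link in setup page": ["help wanted"],
}

# Report every issue that lacks a "type" or a "status" label.
for title, labels in issues.items():
    missing = missing_label_categories(labels)
    if missing:
        print(f"{title}: missing {', '.join(missing)}")
```

A Maintainer could run a check like this over an exported issue list to spot issues that still need a “type” or “status” label.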

    Thank you to all the Maintainers who have tested, reviewed, and modified the initial proposed set of labels. We hope they will make lesson maintenance and contributions easier, and ultimately improve the quality of our teaching materials. As with everything in The Carpentries, the process of deciding how to label our issues is iterative and open to changes based on feedback from community members. Let us know how they work for you! If you have comments or suggestions about issue labeling for our lessons, please add your thoughts to this issue.

  • 04/10/18--17:00: Launching our New Handbook
  • As The Carpentries, we’re excited to announce that we have consolidated and updated many materials and resources to more easily share them online and be a community resource.

    Today we are launching an all-new The Carpentries Handbook. We will also be tweeting regularly through a new Carpentries Twitter account.

    The Carpentries Handbook

    We are excited to release our new Carpentries Handbook! Historically, information and resources have been spread across various websites, Google docs, GitHub repos, and more. We now have a one-stop shop that consolidates all these resources. In one place, you can now find information on how to run a workshop, how to develop and maintain lessons, and how to participate in an instructor training event. You’ll also learn about getting the word out about Carpentries activities through our communication channels, and how to get involved in our global community. Many, many thanks to all the community members who helped develop this site.

    We welcome everyone’s feedback on this Handbook. Feel free to submit issues or pull requests on this GitHub repo.

    The Carpentries Twitter

    We will also be regularly tweeting from our new The Carpentries Twitter account from now on. Data and Software Carpentry-specific messages will still be tweeted from the individual Twitter accounts, and people will most likely tweet the handles of the individual Carpentries when teaching workshops. People are welcome to use the Software Carpentry, Data Carpentry, and The Carpentries Twitter handles in whatever combination suits them.

    Please take a look at all our new material and let us know what you think. You can comment via Twitter, Slack, or Facebook, but since issues are less ephemeral than a Tweet, raising an issue or submitting a pull request to the Handbook repo may work best so we can have a public discussion about what still needs doing.


    We are excited to announce that Chris Erdmann has been hired as the Library Carpentry Community and Development Director starting May 4, 2018.

    Chris has worked in libraries for more than 21 years to integrate data management and workflows in database and library systems. Through training, consulting and tool development to build programs, he has tried to empower people in research and library communities to work effectively with data. Chris received his MLIS at the University of Washington iSchool while working at the University’s Technology Transfer Office, where he helped automate workflows and develop the unit’s web presence and analytics. He spent roughly ten years working alongside astronomers at the European Southern Observatory (ESO) and Harvard-Smithsonian Center for Astrophysics to advance library data-mining and linking services, e.g. ESO Telescope Bibliography. Also during this time, he led an experimental training series called Data Scientist Training for Librarians (DST4L) geared towards teaching librarians data-savvy skills to help transform their library services to meet the needs of their research communities. He recently joined the Library Carpentry governance group.

    He is a co-author with Matt Burton, Liz Lyon, and Bonnie Tijerina on the recent report Shifting to Data Savvy: The Future of Data Science In Libraries, where Library Carpentry and The Carpentries are highlighted as a necessary next step for libraries to advance their research services.

    Chris will be working with the Library Carpentry community and The Carpentries to start mapping out the infrastructure for growing the community, formalizing lesson development processes, expanding its pool of instructors, and inspiring more instructor trainers to meet the demand for Library Carpentry workshops around the globe and thus reach new regions and communities.

    This new position is funded by IMLS and hosted by the University of California Curation Center (UC3), the digital curation program of the California Digital Library (CDL). It is intended to support the work of the Library Carpentry governance committee on streamlining operations with The Carpentries, determining standard curriculum, growing instructor training for librarians and planning for community events like the upcoming Mozilla Sprint to update Library Carpentry materials. Chris will be helping to manage the sprint work in the northern hemisphere.

    Chris is excited about advancing the profession and sees the Library Carpentry and The Carpentries communities as the perfect catalyst to do that. He is on Twitter as @libcce, on GitHub and on LinkedIn, and we’re very excited to welcome Chris to this role!

    For more information on Library Carpentry: Follow @libcarpentry on Twitter. For more information on UC3 and California Digital Library: Follow @caldiglib and @UC3CDL on Twitter.


    The inaugural CarpentryCon is less than 50 days away! The taskforce is diligently working to make sure all t’s are crossed and i’s are dotted to ensure that the Community enjoys a wonderful un-conference. While doing so, we have realized that we need one more thing…YOU!!!

    Have you had a desire to get involved with the planning of CarpentryCon and did not have the time? Or maybe, you felt as though you did not know where you could be most valued? Or maybe, you thought the taskforce had everything under control and did not need your help?

    If you had any of those thoughts, I’m happy to tell you those are all misconceptions. While the planning is well underway, there are areas that can use a few additional hands. And we would LOVE for you to get involved! You may only be able to assist for an hour, a day or possibly the entire conference. It does not matter the amount of time, if you want to help, there will be something that could use your assistance!

    Here are areas that could use YOU!

    • Pre-Conference Setup
    • Registration
    • Speakers and Workshops
    • Social Media
    • AV
    • Entertainment

    There are awesome benefits to becoming a volunteer. Here are just a few:

    • Making an impact on The Carpentries community
    • Networking with The Carpentries community
    • Discounted items
    • Free items

    CarpentryCon will be a history-making event for The Carpentries. We would like as many of our community members as possible to be a part of this great event. If you would like to get involved, please send an email to receive more information.

    We look forward to seeing you in Dublin, Ireland!


    In December 2017, we made a call for community members to contribute to two new sets of Data Carpentry lessons, targeted towards researchers working with geospatial data or survey data for the social sciences.

    There was overwhelming interest from the community in working to develop and publish these two curricula. In the past few months, six new Maintainers for the Geospatial lessons, and five new Maintainers for the Social Sciences lessons have gone through Maintainer onboarding and begun to work with their lessons. Please join our community in welcoming Chris Prener, Geoff LaFlair, Peter Smyth, Juan Fung, Stephen Childs, Tyson Swetnam, Lauren O’Brien, Janani Selvaraj, and Lachlan Deer as new Maintainers on these lessons (Chris Prener and Juan Fung will serve as Maintainers for both the Geospatial and the Social Sciences lessons), as well as Leah Wasser and Joseph Stachelek, who will be continuing on as Maintainers for the Geospatial lessons.

    In addition to new Maintainers, a set of Curriculum Advisors has also been assembled for each of these new curricula. Curriculum Advisors help to provide strategic oversight, vision, and leadership for a particular set of lessons to guide the overall development of the lessons. Please join us in welcoming Arindam Basu, Chris Prener, Geoff LaFlair, Katie Metzler, Rachel Gibson, Reka Solymosi, Peter Smyth, Scott Peterson, and Stephen Childs as the Curriculum Advisory Committee for the Social Sciences lessons and Anne Fouilloux, Arthur Endsley, Chris Prener, Jeff Hollister, Joseph Stachelek, Leah Wasser, Michael Sumner, Michele Tobias, and Stace Maples as the Curriculum Advisory Committee for the Geospatial lessons. Curriculum Advisors meet twice yearly and advise the curriculum’s Maintainers in overall strategy for the lessons. Meeting minutes for Curriculum Advisory meetings are available in the group’s GitHub repo.

    Thanks to the phenomenal support from the Maintainers and Curriculum Advisors for these lessons, as well as the support of the Carpentry community during the recent Bug BBQ, these lessons are on track for publication. The Social Sciences lessons are scheduled for release at the end of April and the Geospatial lessons will be complete in June.

    We still have work to do before publication! Everyone is invited to help as we enter the final stretch for preparing these lessons for their first official release. If you have a few minutes to spare, head on over to one of the lesson repositories and check out the open issues or review an existing pull request. You can also contact the lesson Maintainers on the SWC Slack channel with specific questions.

    Thank you to everyone who has participated in building these lessons up to this point. It has been a fantastic community effort. We’re excited to be releasing these lessons soon so that they can benefit researchers in the social sciences and geospatial communities.


    Website Launch

    We are excited to announce that The Carpentries website is now live!

    The new website celebrates our merged identity as The Carpentries.

    The new website will give you access to all things ‘Carpentries’, in other words, it will give you easy access to what is common information across the merged organization. The sorts of things you will find there include our Code of Conduct, information about instructor training and assessment, a range of shared policies, including our privacy policy, details of staffing and project governance, and a whole lot more.

    The existing Data and Software Carpentry sites will remain in place alongside the new site. Since Data and Software Carpentry are ongoing lesson organizations, information related to lessons belongs on those individual sites. We will gradually take down material that is now more logically based on The Carpentries website.

    You may notice that a lot of the links on The Carpentries website take you directly to The Carpentries Handbook that we launched last week.

    The Handbook has been enthusiastically received by our community. For those who haven’t seen it yet, find it here. The aim of the Handbook is to provide a one-stop shop for people wanting all kinds of Carpentries-related information. Information is being added and updated all the time so please let us know if there is something missing. The Handbook and the website will complement each other to cover all things Carpentries.

    Please let us know if there are errors or omissions on our new website. You can raise an issue about the website at this link, or about the Handbook at this link.

    The launch of the new website completes our transition to a new, merged, online identity as The Carpentries. Increasingly we will blog as The Carpentries, rather than as Software or Data Carpentry, so be sure to check out our new blog.

    We also have our new merged Twitter feed. Follow The Carpentries on Twitter.



    The UF R Users Group was formed in January 2017, and since then we’ve been running a weekly “UF R Meetup”: a two-hour session consisting of a 30- to 60-minute presentation or tutorial followed by an “open lab” session. The meetup is meant to be a casual, informal opportunity to learn as a community and seek face-to-face advice.

    By the end of the second semester of running the meetup, we had identified a couple of issues:

    • The majority of the participants were either beginners or completely new to R (and programming in general).
    • As our presentations shifted to cater to new users, it became difficult to engage and entice more advanced programmers.

    In addition, our presentations on the basics of R were unstructured and constructed on-the-fly – not the best way to teach and learn R. We felt that these disconnects were making it difficult to establish a sustainable learning community.

    In January 2018, we decided to run an introductory workshop series separate from the meetup. The workshop would provide structured lessons on the basics of R and allow the meetup to cover more advanced topics. Luckily for us, The Carpentries already have well-structured lessons for these materials, and we could rely on the strong pool of Carpentry instructors at the University of Florida.

    The question then became: “Do we want to run a traditional two-day Carpentry workshop, or try something different?”. We already knew that there was interest in regular weekly meetings, and saw potential in giving access to people who could not commit to a full two-day Carpentries workshop, or people who might need a refresher even though they’ve taken the two-day workshop. So we decided to run our workshop a bit differently than normal.


    We used the Data Carpentry in Ecology curriculum as a starting point. This included Data Analysis and Visualization in R, Data Organization in Spreadsheets, and Data Cleaning with OpenRefine. The two-day workshops usually include the Data Management in SQL lesson as well, but we felt it would have been too much for learners to learn all the SQL concepts in a two-hour session. Instead we opted to create some new material centered on the join features in dplyr, which rely on very similar concepts. This extended naturally from the dplyr lesson, and we even titled it “Advanced Dataframe Manipulation” to reflect that.
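To make the shared concept concrete, here is a minimal sketch of the two most common joins, written in Python with the standard-library sqlite3 module rather than in R. The tables, column names, and rows are invented toy data, not the workshop dataset. An inner join keeps only rows that have a match in both tables, while a left join keeps every row of the first table; dplyr’s inner_join() and left_join() follow the same logic.

```python
import sqlite3

# Toy in-memory database; table names, columns, and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (record_id INTEGER, species_id TEXT);
    CREATE TABLE species (species_id TEXT, genus TEXT);
    INSERT INTO surveys VALUES (1, 'NL'), (2, 'DM'), (3, 'ZZ');
    INSERT INTO species VALUES ('NL', 'Neotoma'), ('DM', 'Dipodomys');
""")

# Inner join: only survey records whose species_id exists in both tables.
inner = conn.execute("""
    SELECT surveys.record_id, species.genus
    FROM surveys
    JOIN species ON surveys.species_id = species.species_id
    ORDER BY surveys.record_id
""").fetchall()

# Left join: every survey record, with NULL (None) where no species matches.
left = conn.execute("""
    SELECT surveys.record_id, species.genus
    FROM surveys
    LEFT JOIN species ON surveys.species_id = species.species_id
    ORDER BY surveys.record_id
""").fetchall()

print("inner:", inner)
print("left: ", left)
conn.close()
```

The unmatched record (species_id 'ZZ') is dropped by the inner join and kept with a missing genus by the left join, which is the distinction the lesson on joins needs to get across.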

    | Week | Lesson |
    |:-----|:-------|
    | 1 | Intro to R |
    | 2 | Data Organization |
    | 3 | Starting with Data |
    | 4 | Manipulating Dataframes |
    | 5 | Visualizing Data |
    | 6 | Advanced Dataframe Manipulation |
    | 7 | OpenRefine |

    Besides that, it was run exactly like any other Carpentry workshop. We had different instructors for each lesson, there were helpers available, we created an Etherpad for collaborative note-taking, and we used red and green sticky notes for real-time feedback. You can view the workshop homepage.

    How it went

    We’ve been a part of many Data Carpentry and Software Carpentry workshops here at the University of Florida, and this one went as well as any of them. Anonymous feedback at the end of each lesson was universally positive, and several participants told us in person how much they enjoyed it.

    Sticky note feedback

    We capped the elongated workshop at 40 participants and it filled fairly quickly. However, at most only 18 came to a particular session and attendance dropped over the 7 weeks.


    Several factors likely contributed to this attendance pattern. Attendance is often low the first time a new training opportunity is offered. We also chose not to collect a registration fee, because our group is designed to be an informal alternative to other resources on our campus, including formal courses and traditional Carpentry workshops (on average there are about three Carpentry workshops each semester at UF). The lack of a financial commitment from students may have contributed to the depressed attendance. We also found that there was less interest and reduced attendance in the non-R focused lessons, as well as more interest in the tidyverse-based lessons compared to base R lessons. Scheduling conflicts also arose over the course of the series, and once someone had to miss a lesson there appeared to be a lack of motivation to continue.

    Lastly, we were very interested in how this schedule format improved access. Anecdotally, several participants told us that they preferred a two-hour-a-week workshop over a full two days. In a post-workshop survey, two of the three respondents said they preferred this schedule over a two-day workshop.

    Scheduling-wise, the majority of material fit into the two-hour time slots. The exception was the “Manipulating Dataframes” lesson where we did not have time for the very last section. Luckily this fit in nicely with the “Advanced Dataframe Manipulation” lesson to fill the full two hours in week 6.

    Lessons Learned

    Overall, we feel this elongated workshop was a success and we hope to run similar ones in the future. We are encouraged by the post-survey responses, as well as the anecdotal comments from the attendees. The workshop series also provided a less time- and material-intensive opportunity for newly trained instructors to gain some teaching experience.

    There are a few things we may change:

    • There were participants who felt confident not attending the first few sessions and only attended specific ones such as Data Manipulation or Visualization. This likely contributed to, at most, 18 of the 40 sign-ups attending. In the beginning we encouraged people to attend every lesson but did not enforce this. In the future, we would consider session-specific sign-ups where participants can express interest in any or all of the sessions based on their needs.

    • We found it helpful to do a short recap at the beginning of each session to quickly summarize the primary lessons from the prior week.

    • We collected and re-distributed the post-it notes every week so as not to waste them, though some eventually lost their stickiness. In the end we used up roughly three-quarters of a single stack for each color.


    We are excited to announce the initial release of a Data Carpentry Social Sciences Curriculum. This is the first Data Carpentry Curriculum to be released targeted towards researchers outside of the life sciences and provides an opportunity to reach out to new communities.

    Peter Smyth has assembled the initial content for these lessons with the guidance of Rachel Gibson, Professor of Political Science at the Cathie Marsh Institute of Social Research, University of Manchester, UK. It was polished during the April 2018 Bug BBQ, and the finishing was done by the lesson Maintainers in coordination with Carpentries staff.

    This curriculum aims to teach skills similar to those covered in the Ecology curriculum. It is focused on best practices for working with rectangular and tidy data. The curriculum covers data organization in spreadsheets, data cleaning with OpenRefine, as well as data manipulation and visualization with R. There are also lessons on SQL and Python that are available but are not part of this initial release.

    As with other materials for Data Carpentry, the same dataset is used across all the lessons. Here, we use a simplified version of a research dataset generated by the SAFI (Studying African Farmer-led Irrigation) research project. This dataset is available on Figshare and is survey data relating to households and agriculture in Tanzania and Mozambique. The survey data was collected through interviews conducted between November 2016 and June 2017 and covered such things as household features (e.g., construction materials used, number of household members), agricultural practices (e.g., water usage), assets (e.g., number and types of livestock) and details about the household members.

    If this curriculum were a piece of software, we would say it is in “beta”. The authors of this curriculum have taught it, and it is now ready to be taught by other members of The Carpentries community. We are interested in your feedback to improve it. We want to ensure it meets the needs and matches the skills that social scientists want to acquire when working with data. If you are a social scientist (or studying to become one), please review the lessons and provide us with your feedback. If you are interested in teaching one of the first Social Sciences Data Carpentry workshops, let us know by filling out this form.

  • 08/28/18--17:00: Geospatial Launch
  • The long-awaited Data Carpentry curriculum for working with Geospatial data is now ready to teach! As with all our newly developed curricula, these lessons are now in ‘beta’. We are actively promoting workshops and collecting information from those workshops to improve these lessons as they are taught more broadly and in different contexts. We will also be onboarding Instructors to prepare them to teach these new lessons. Keep reading for more details.

    So what’s in the material?

    This R-based geospatial workshop will introduce project organisation and management for spatial data, cover data structures and storage and transfer formats, teach the creation of summary statistics and publication-quality graphics, and help users work with and plot vector and raster-format spatial data in R. Find more information on the workshop homepage.

    Want to teach this material?

    We will be onboarding Instructors to prepare them to teach these new lessons. We also want to run some pilot workshops so that we can assess what we have got right, and what might still need some tweaking.

    Lesson background

    These lessons were initially developed in 2016 through a hackathon held in conjunction with the National Ecological Observatory Network (NEON). Hackathon participants included the following people who became the initial authors of the lessons: Leah A. Wasser - University of Colorado, Megan A. Jones - NEON, Zack Brym - University of Florida, Kristina Riemer - University of Florida, Jason Williams - Cold Spring Harbor Lab, Jeff Hollister - US Environmental Protection Agency, Mike Smorul - SESYNC, Joseph Stachelek - Michigan State University, Marissa Guarinello - NKN/University of Idaho, Jonah Duckles - The Carpentries, Keely Roth - University of California at Davis, Mike Alonzo - NASA Goddard, Ben Best - Duke / UCSB, Matt Kwit - Duke, Tracy Teal - The Carpentries, Kaitlin Stack Whitney - University of Wisconsin-Madison, Dave Roberts - Montana State, Courtney Soderberg - Center for Open Science, Sean Barberie - University of Alaska Fairbanks. The workshop materials were piloted in March 2016, and the lesson release has been much anticipated by Carpentries’ community members. Most of the data used in the workshop has been sourced from NEON. You can see other NEON tutorials for advanced GIS topics here.

    Recent community involvement

    Recent developments in these materials have been led by a highly active and engaged group of Maintainers (Lachlan Deer, Juan Fung, Lauren O’Brien, Chris Prener, Janani Selvaraj, Joseph Stachelek, Tyson Swetnam, Jane Wyngaard) and Curriculum Advisors (Anne Fouilloux - University of Oslo, Arthur Endsley - University of Michigan, Chris Prener - St Louis University, Jeff Hollister - US Environmental Protection Agency, Joseph Stachelek - Michigan State University, Leah Wasser - University of Colorado, Michael Sumner - Australian Antarctic Data Centre, Michele Tobias - University of California, Davis, Stace Maples - Stanford University).

    If you are interested in the direction and decisions the Curriculum Advisors took for the lesson, you can see their minutes. The finalisation of many parts of the material was down to a big burst of work during the April 2018 Bug BBQ. Thanks to all the community members who took part.

    Special thanks go to:

    • Lauren O’Brien for re-organizing the Geospatial Project Organization and Management lesson to line up with changes to the rest of the curriculum.
    • Lachlan Deer, Juan Fung, Joseph Stachelek, Anne Fouilloux and Justin Millar for converting all of the episodes to ggplot from base R graphics.
    • Joseph Stachelek for transferring the lessons to the current lesson template.
    • Chris Prener for updating the installation instructions and creating a Docker image for the lessons.
    • Leah A. Wasser and Megan A. Jones for providing an introduction to the data used in the lesson.
    • Michael Culshaw-Maurer, Anne Fouilloux, Michael Heeremans, Megan A. Jones, Natalie Robinson, Joseph Stachelek, Tracy Teal, Michele Tobias, and Leah A. Wasser for teaching pilot workshops.
    • NEON for collecting and sharing the data, organizing and co-hosting the 2016 Hackathon, and providing staff time to produce these lessons.

    Teach or host a Geospatial workshop!

    Want to get involved with the Geospatial materials?

    • Get badged to teach the Geospatial lessons. Sign up for onboarding using this Etherpad. Onboarding sessions also appear on the Community Calendar.
    • Request a Geospatial beta pilot workshop at your institution using this form.
    • Self-organise a Geospatial beta pilot workshop at your institution. Use our self-organized workshop checklist to plan your workshop.

  • 09/03/18--17:00: Atmos Ocean Launch
  • Back in late 2012, I was a couple of years into my first job out of college. My undergraduate studies had left me somewhat underprepared for the coding associated with analyzing climate model data for a national science organization, so I was searching online for assistance with Python programming. I stumbled upon the website of an organization called Software Carpentry, which at the time was a relatively small group of volunteers running two-day scientific computing “bootcamps” for researchers. I reached out to ask if they’d be interested in running a workshop alongside the 2013 Annual Conference of the Australian Meteorological and Oceanographic Society (AMOS), and to my surprise Greg Wilson - the co-founder of the organization - flew out to Australia to teach at our event in Melbourne and another in Sydney (the first ever bootcamps outside of North America and Europe). I trained up as an instructor soon after, and from 2014-2017 I hosted Software Carpentry workshops alongside the AMOS conference, as well as other ad hoc workshops in various meteorology and oceanography departments.

    While these workshops were very popular and well received (Software Carpentry workshops always are), in the back of my mind I wanted to have a go at running a workshop designed specifically for atmosphere and ocean scientists. Instead of teaching generic skills in the hope that people would figure out how to apply them in their own context, I wanted to cut out the middle step and run a workshop in the atmosphere and ocean science context. This idea of discipline (or data-type) specific workshops was the driving force behind the establishment of Data Carpentry, so this year with their assistance I’ve developed lesson materials for a complete one-day workshop:

    The workshop centers on the task of writing a Python script that calculates and plots the seasonal rainfall climatology (i.e. the average rainfall) from the output of any climate model. Such data is typically stored in netCDF file format and follows a strict “climate and forecasting” metadata convention. Along the way, we learn about the PyAOS stack (i.e. the ecosystem of libraries used in the atmosphere and ocean sciences), how to manage and share a software environment using conda, how to write modular/reusable code, how to write scripts that behave like other command line programs, version control, defensive programming strategies and how to capture and record the provenance of the data files and figures that we produce.
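To give a flavour of how these pieces fit together, here is a simplified sketch that is not taken from the lesson materials. It uses only the Python standard library and twelve made-up monthly rainfall values in place of a netCDF file, so no plotting or xarray is involved. The function names, the season grouping, and the format of the provenance record are all assumptions made for illustration.

```python
import argparse
import datetime
import sys
from collections import defaultdict

# Meteorological seasons by month number (a grouping assumed for this sketch).
SEASONS = {12: "DJF", 1: "DJF", 2: "DJF",
           3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA",
           9: "SON", 10: "SON", 11: "SON"}


def seasonal_climatology(monthly_rainfall):
    """Average rainfall per season from (month, value) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for month, value in monthly_rainfall:
        # Defensive programming: stop early on impossible input.
        assert 1 <= month <= 12, f"invalid month: {month}"
        assert value >= 0, f"rainfall cannot be negative: {value}"
        season = SEASONS[month]
        totals[season] += value
        counts[season] += 1
    return {season: totals[season] / counts[season] for season in totals}


def provenance_record(argv):
    """One-line record of when and how a result was produced."""
    timestamp = datetime.datetime.now().isoformat(timespec="seconds")
    return f"{timestamp}: {sys.executable} {' '.join(argv)}"


def main(argv):
    """Parse command line style arguments and print the climatology."""
    parser = argparse.ArgumentParser(description="Seasonal rainfall climatology")
    parser.add_argument("values", nargs=12, type=float,
                        help="twelve monthly rainfall values, January first")
    args = parser.parse_args(argv)
    data = list(enumerate(args.values, start=1))
    for season, mean in sorted(seasonal_climatology(data).items()):
        print(f"{season}: {mean:.2f}")
    print(provenance_record(["climatology.py"] + list(argv)))


# Demonstration call with invented values; "climatology.py" is a made-up name.
main(["80", "70", "60", "40", "30", "20", "10", "15", "25", "45", "60", "75"])
```

In a real script the last line would usually be replaced by the standard `if __name__ == "__main__":` guard calling `main(sys.argv[1:])`, so that the file can be both imported as a module and run from the command line.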

    I’ve run the workshop twice now (at the 2018 AMOS Conference in Sydney and at Woods Hole Oceanographic Institution last month), which means I’ve completed the alpha stage of the Data Carpentry lesson development cycle. Moving from the alpha to beta stage involves having people other than me teach, which is where you come in. If you’re a qualified Carpentries instructor and would be interested in teaching the lessons (some experience with the netCDF file format and xarray Python library is useful), please get in touch with either myself or Francois Michonneau (Curriculum Development Lead for Data Carpentry). You can also request a workshop at your institution by contacting us and we’ll reach out to instructors. There is no fee for a pilot workshop, but you would need to cover travel expenses for instructors. I’d also be happy to hear any general feedback about the lesson materials at the associated GitHub repository.