Data Carpentry is a non-profit organization that develops and provides data skills training to researchers.

    Remind me…

    When is the Bug BBQ?

    It starts on April 12th, 9am Eastern (US) and ends on April 13th, 5pm Pacific (US) (click on the links to see these dates and times for your timezone).

    What is the Bug BBQ? Why are we doing one?

    You can read more about this in the announcement post.

    What is the website with all the info?

    I want to contribute to the lessons during the Bug BBQ!

    We created a guide that outlines how to contribute to the Bug BBQ. There are many ways to get involved: the issues to address are as varied as the skills they require and the time it takes to fix them.

    As indicated in the guide, we strongly recommend that you leave a comment on the issue you are going to address before starting to work on it. This ensures that no work gets duplicated.

    We will use Slack to communicate and coordinate. Feel free to ask questions in the #general channel or in the channel associated with the lesson you are working on. The Bug BBQ website lists all the Slack channels for the lessons involved in the Bug BBQ.

    I’m a maintainer, what do I need to know before the Bug BBQ?

    We suggest you coordinate with your co-maintainers to decide who will be online when. It would be great if at least one maintainer of your lesson could check in every few hours to review pull requests, provide feedback on the issues that are being opened, and monitor the Slack channels. Spreading this work among the maintainers makes it easier. We prepared a spreadsheet for you to complete so that you can indicate your availability. This spreadsheet has tabs for Geospatial and Social Sciences (which are preparing for their first publication) and another tab for the other lessons that are part of the event.

    We also suggest that you prepare your repository for the Bug BBQ. Close stale issues and pull requests, assign at least one of each “type” and “status” GitHub label to the relevant issues, and open issues for things you know need to be addressed. Don’t forget to add the “good first issue” and “help wanted” labels to the relevant issues so that new contributors can easily identify things they can work on.

    Don’t hesitate to contact François or Erin, or leave a message on Slack if you have any questions. If there is anything we can do to make it easier for you to participate in the Bug BBQ, let us know.

    I’m a maintainer, what do I need to do during the Bug BBQ?

    During the Bug BBQ, as a Maintainer, you will provide feedback on the issues and pull requests that will come to your repository. You will assign GitHub labels to the issues that are being opened and update them as needed. You will answer questions contributors may have about the lesson.

    This is a big job! Please coordinate with your co-Maintainers to decide who will do what, and don’t hesitate to contact us if we can help provide support for your team.

    I’m a maintainer, and I want my lesson to be involved in the Bug BBQ. Is it too late?

    No! Let François know and we’ll add your lesson to the Bug BBQ website.

    I want to organize a local event, what do I need to do?

    Local events are a great way to build and strengthen Carpentries communities. They are also a great opportunity to teach (and learn) how to use Git and GitHub for collaborative lesson development. You need a room with power outlets and an internet connection, and a few hours of availability (BBQ optional).

    If you want to organize a local event, fill out a GitHub issue.


    The Carpentries lessons belong to the community. A Bug BBQ is an opportunity to work collectively on the lessons to make them better. We look forward to seeing all of your contributions!


    Did you miss the deadline to join a mentoring group? Do not fear, there are still openings for mentees to join groups in the following timezones:

    To join, fill out this application.

    Mentoring groups are beneficial to participants because group members are able to focus on specific goals, including teaching their first workshop and developing new lesson contribution material. Being a part of a group that addresses something important to you is both powerful and enjoyable. You do not want to miss out on the Carpentries mentoring opportunities. Join a mentoring group today!


    In November and December of 2017, we conducted a series of interviews with lesson Maintainers. You can read the summary and see a link to the full report on this blog post. A major theme that emerged from the interviews was that the Maintainers’ jobs were sometimes difficult because it was not always clear what was expected from them or how much authority they had in deciding what can be changed in the content of the lessons. To begin to address these issues, we developed an onboarding process for new Maintainers to clarify the role and responsibilities of Maintainers. We also created Curriculum Advisory Committees for the different curricula, to offer high-level guidance on lesson structure and technology choices.

    As our Maintainers have told us, being a Maintainer is a rewarding experience. Maintainers are often the first point of contact for new contributors with our community after going through instructor training. We encourage new Instructors to contribute to our lessons as part of the checkout process. We value the perspective that fresh eyes can bring to improve our lessons and value the opportunity for contributors to experience how powerful collaborative lesson development can be. Maintainers receive pull requests and issues that touch many aspects of our lessons. Some are small and easy to deal with (e.g., fixing a typo), others suggest significant changes to the structure of the lesson or the tools that are used, while others provide comments, suggestions, feedback or ideas for discussion.

    The number and diversity of issues and pull requests that Maintainers receive can sometimes be overwhelming. Both Maintainers and contributors have said that better issue labeling would make it easier to maintain and contribute to lessons. Being able to categorize contributions would help Maintainers to think through the type of issues being reported, and allow them to identify suitable next steps to address them. Issue labels are also useful for facilitating communication among Maintainers.

    Especially for new contributors, issue labels can make contribution easier by signaling what is available to work on, and the type of expertise needed.

    The default set of labels provided by GitHub is not well-suited for our purpose, and Maintainers have expressed the need to have more options to describe the type of issues. For instance, the “bug” label is appropriate for a repository that only contains code, but our repositories contain both code and text and what qualifies as a bug is not always obvious. We also need to accommodate discussions, which often take place as issues, and other situations that the standard set of GitHub labels doesn’t handle well.

    In the past, we have modified the default set of labels to include the following: “bug”, “discussion”, “enhancement”, “help-wanted”, “instructor-training”, “newcomer-friendly”, “question”, “template-and-tools”, “work-in-progress”. However, while these labels are better suited to our lessons than the default set of labels provided by GitHub, use of these labels has not been standard across all lesson repos, with many repos introducing new labels. This indicates a need for a more robust set of labels to cover different scenarios faced in our lessons.

    Beyond the type of issues, we also want to signal the progress status to both contributors and Maintainers. Contributors should be able to tell from the list of issues the ones that are available to be worked on, and Maintainers should be able to identify issues that are being worked on by contributors. For this reason, we worked with the Maintainer community to propose a new set of issue labels, including two main categories: the type labels, and the status labels. We include these words as prefixes so Maintainers can easily filter on them when assigning them to issues.
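
    As a minimal sketch of why prefixes help with filtering, assume each label is a plain string of the form `prefix:name`. The helper name, the issue titles, and the specific label names below are illustrative assumptions, not the set of labels adopted by The Carpentries.

```python
def labels_with_prefix(labels, prefix):
    """Return only the labels that belong to one category, e.g. 'type' or 'status'."""
    return [label for label in labels if label.startswith(prefix + ":")]


def issues_with_label(issues, label):
    """Return the titles of the issues that carry the given label."""
    return [issue["title"] for issue in issues if label in issue["labels"]]


# Hypothetical issues; titles and label names are made up for illustration.
issues = [
    {"title": "Fix typo in episode 2", "labels": ["type:typo", "status:in progress"]},
    {"title": "Reorder plotting section", "labels": ["type:enhancement", "status:needs contributor"]},
    {"title": "Clarify setup instructions", "labels": ["type:clarification", "status:needs contributor", "help wanted"]},
]

# A contributor looking for something to pick up filters on a status label.
available = issues_with_label(issues, "status:needs contributor")
print(available)
```

    Because the category is part of the label text, the same string test works whether the labels are read from the GitHub web interface, an export, or an API response.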

    Maintainers have also provided feedback that it is complicated to assess the priority and difficulty of issues. We proposed a single label for each category: one to signal issues that need to be fixed as soon as possible (“high priority”), and one to signal that a first-time contributor can address the issue (we’ll use “help wanted”).

    Based on the initial feedback from Maintainers, and after looking at how other organizations that deal with many contributions use GitHub labels, we developed a set of issue labels to test, along with definitions for each label. We wrote a proposal, and requested feedback from the community on this document as well as during our monthly Maintainers call. Maintainers from five lesson repositories volunteered to test these labels:

    The feedback from Maintainers of these pilot repositories was positive, and the broader Maintainer community commented extensively on the proposal. Comments revolved around the following issues:

    • Too many labels, especially in the “type” category, leading to too many colors;
    • GitHub has started to highlight issues that have the “help wanted” and “good first issue” labels on its website; not using these labels would mean not taking advantage of this feature;
    • We included a “status:completed” label that was viewed as redundant with simply closing the issue;
    • The distinction between “needs contributor” and “help wanted” was not clear;
    • The definition of some labels needed to be clarified.

    We’re very grateful to the Maintainer community for their thoughtful feedback on this proposal. With the comments that we received, and the testing that was done by Maintainers for the five pilot repositories, we gained valuable information about the usefulness of these issue labels. We are now ready to move into a beta-test. We will test these labels for at least one month, and will solicit feedback from Maintainers and survey how they are being used across our repositories during this time. This information will be used to guide any modifications to the issue labels and ensure that they are maximally useful to our Maintainer and contributor communities.

    More detailed information about the labels and their definitions is available in The Carpentries handbook.

    The goal for these new issue labels is to provide tools and options to make the Maintainer role easier and help new contributors know where they can be most useful. With these issue labels, when a contributor opens the issue list for a lesson, they’ll know which issues can be addressed or are already being worked on. To this end, we recommend that each issue be labeled with both a “type” and a “status” label, and that the labels be updated as work on the issue progresses.
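
    The recommendation that every issue carry both a “type” and a “status” label can be checked mechanically. The sketch below is only an illustration: it assumes labels are strings of the form `prefix:name`, and the function name and example labels are hypothetical.

```python
REQUIRED_CATEGORIES = ("type", "status")


def missing_label_categories(labels):
    """Return the required categories ('type', 'status') with no label on the issue."""
    # Labels without a colon (e.g. "help wanted") belong to no category.
    present = {label.split(":", 1)[0] for label in labels if ":" in label}
    return [category for category in REQUIRED_CATEGORIES if category not in present]


# Hypothetical label sets for three issues.
print(missing_label_categories(["type:bug", "status:in progress"]))
print(missing_label_categories(["type:bug", "help wanted"]))
print(missing_label_categories(["good first issue"]))
```

    A Maintainer could run a check like this over the open issues of a repository to find the ones that still need labeling.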

    Thank you to all the Maintainers who have tested, reviewed, and modified the initial proposed set of labels. We hope they will make lesson maintenance and contributions easier, and ultimately improve the quality of our teaching materials. As with everything in The Carpentries, the process of deciding how to label our issues is iterative and open to changes based on feedback from community members. Let us know how they work for you! If you have comments or suggestions about issue labeling for our lessons, please add your thoughts to this issue.

  • 04/10/18--17:00: Launching our New Handbook
  • As The Carpentries, we’re excited to announce that we have consolidated and updated many materials and resources so that they are easier to share online and can serve as a community resource.

    Today we are launching an all-new The Carpentries Handbook. We will also be tweeting regularly through a new Carpentries Twitter account.

    The Carpentries Handbook

    We are excited to release our new Carpentries Handbook! Historically, information and resources have been spread across various websites, Google docs, GitHub repos, and more. We now have a one-stop shop that consolidates all these resources. In one place, you can now find information on how to run a workshop, how to develop and maintain lessons, and how to participate in an instructor training event. You’ll also learn about getting the word out about Carpentries activities through our communication channels, and how to get involved in our global community. Many, many thanks to all the community members who helped develop this site.

    We welcome everyone’s feedback on this Handbook. Feel free to submit issues or pull requests on this GitHub repo.

    The Carpentries Twitter

    We will also be tweeting regularly from our new The Carpentries Twitter account from now on. Data and Software Carpentry-specific messages will still be tweeted from the individual Twitter accounts, and people will most likely tweet the handles of the individual Carpentries when teaching workshops. People are welcome to use the Software Carpentry, Data Carpentry, and The Carpentries Twitter handles in whatever combination suits them.

    Please take a look at all our new material and let us know what you think. You can comment via Twitter, Slack, or Facebook, but since issues are less ephemeral than a Tweet, raising an issue or submitting a pull request to the Handbook repo may work best so we can have a public discussion about what still needs doing.


    We are excited to announce that Chris Erdmann has been hired as the Library Carpentry Community and Development Director starting May 4, 2018.

    Chris has worked in libraries for more than 21 years to integrate data management and workflows in database and library systems. Through training, consulting and tool development to build programs, he has tried to empower people in research and library communities to work effectively with data. Chris received his MLIS at the University of Washington iSchool while working at the University’s Technology Transfer Office, where he helped automate workflows and develop the unit’s web presence and analytics. He spent roughly ten years working alongside astronomers at the European Southern Observatory (ESO) and Harvard-Smithsonian Center for Astrophysics to advance library data-mining and linking services, e.g. ESO Telescope Bibliography. Also during this time, he led an experimental training series called Data Scientist Training for Librarians (DST4L) geared towards teaching librarians data-savvy skills to help transform their library services to meet the needs of their research communities. He recently joined the Library Carpentry governance group.

    He is a co-author with Matt Burton, Liz Lyon, and Bonnie Tijerina on the recent report Shifting to Data Savvy: The Future of Data Science In Libraries, where Library Carpentry and The Carpentries are highlighted as a necessary next step for libraries to advance their research services.

    Chris will be working with the Library Carpentry community and The Carpentries to start mapping out the infrastructure for growing the community, formalizing lesson development processes, expanding its pool of instructors, and inspiring more instructor trainers to meet the demand for Library Carpentry workshops around the globe and thus reach new regions and communities.

    This new position is funded by IMLS and hosted by the University of California Curation Center (UC3), the digital curation program of the California Digital Library (CDL). It is intended to support the work of the Library Carpentry governance committee on streamlining operations with The Carpentries, determining standard curriculum, growing instructor training for librarians and planning for community events like the upcoming Mozilla Sprint to update Library Carpentry materials. Chris will be helping to manage the sprint work in the northern hemisphere.

    Chris is excited about advancing the profession and sees the Library Carpentry and The Carpentries communities as the perfect catalyst to do that. He is on Twitter as @libcce, on GitHub and on LinkedIn, and we’re very excited to welcome Chris to this role!

    For more information on Library Carpentry: Follow @libcarpentry on Twitter. For more information on UC3 and California Digital Library: Follow @caldiglib and @UC3CDL on Twitter.


    The inaugural CarpentryCon is less than 50 days away! The taskforce is diligently working to make sure all t’s are crossed and i’s are dotted to ensure that the Community enjoys a wonderful un-conference. While doing so, we have realized that we need one more thing…YOU!!!

    Have you had a desire to get involved with the planning of CarpentryCon and did not have the time? Or maybe, you felt as though you did not know where you could be most valued? Or maybe, you thought the taskforce had everything under control and did not need your help?

    If you had any of those thoughts, I’m happy to tell you those are all misconceptions. While the planning is well underway, there are areas that can use a few additional hands. And we would LOVE for you to get involved! You may only be able to assist for an hour, a day or possibly the entire conference. It does not matter the amount of time, if you want to help, there will be something that could use your assistance!

    Here are areas that could use YOU!

    • Pre-Conference Setup
    • Registration
    • Speakers and Workshops
    • Social Media
    • AV
    • Entertainment

    There are awesome benefits to becoming a volunteer. Here are just a few:

    • Making an impact on The Carpentries community
    • Networking with The Carpentries community
    • Discounted items
    • Free items

    CarpentryCon will be a history-making event for The Carpentries. We would like as many of our community members as possible to be a part of this great event. If you would like to get involved, please send an email to receive more information.

    We look forward to seeing you in Dublin, Ireland!


    In December 2017, we made a call for community members to contribute to two new sets of Data Carpentry lessons, targeted towards researchers working with geospatial data or survey data for the social sciences.

    There was overwhelming interest from the community in working to develop and publish these two curricula. In the past few months, six new Maintainers for the Geospatial lessons, and five new Maintainers for the Social Sciences lessons have gone through Maintainer onboarding and begun to work with their lessons. Please join our community in welcoming Chris Prener, Geoff LaFlair, Peter Smyth, Juan Fung, Stephen Childs, Tyson Swetnam, Lauren O’Brien, Janani Selvaraj, and Lachlan Deer as new Maintainers on these lessons (Chris Prener and Juan Fung will serve as Maintainers for both the Geospatial and the Social Sciences lessons), as well as Leah Wasser and Joseph Stachelek, who will be continuing on as Maintainers for the Geospatial lessons.

    In addition to new Maintainers, a set of Curriculum Advisors has also been assembled for each of these new curricula. Curriculum Advisors help to provide strategic oversight, vision, and leadership for a particular set of lessons to guide the overall development of the lessons. Please join us in welcoming Arindam Basu, Chris Prener, Geoff LaFlair, Katie Metzler, Rachel Gibson, Reka Solymosi, Peter Smyth, Scott Peterson, and Stephen Childs as the Curriculum Advisory Committee for the Social Sciences lessons and Anne Fouilloux, Arthur Endsley, Chris Prener, Jeff Hollister, Joseph Stachelek, Leah Wasser, Michael Sumner, Michele Tobias, and Stace Maples as the Curriculum Advisory Committee for the Geospatial lessons. Curriculum Advisors meet twice yearly and advise the curriculum’s Maintainers in overall strategy for the lessons. Meeting minutes for Curriculum Advisory meetings are available in the group’s GitHub repo.

    Thanks to the phenomenal support from the Maintainers and Curriculum Advisors for these lessons, as well as the support of the Carpentry community during the recent Bug BBQ, these lessons are on track for publication. The Social Sciences lessons are scheduled for release at the end of April and the Geospatial lessons will be complete in June.

    We still have work to do before publication! Everyone is invited to help as we enter the final stretch for preparing these lessons for their first official release. If you have a few minutes to spare, head on over to one of the lesson repositories and check out the open issues or review an existing pull request. You can also contact the lesson Maintainers on the SWC Slack channel with specific questions.

    Thank you to everyone who has participated in building these lessons up to this point. It has been a fantastic community effort. We’re excited to be releasing these lessons soon so that they can benefit researchers in the social sciences and geospatial communities.


    Website Launch

    We are excited to announce that The Carpentries website is now live!

    The new website celebrates our merged identity as The Carpentries.

    The new website will give you access to all things ‘Carpentries’, in other words, it will give you easy access to what is common information across the merged organization. The sorts of things you will find there include our Code of Conduct, information about instructor training and assessment, a range of shared policies, including our privacy policy, details of staffing and project governance, and a whole lot more.

    The existing Data and Software Carpentry sites will remain in place alongside the new site. Since Data and Software Carpentry are ongoing lesson organizations, information related to lessons belongs on those individual sites. We will gradually take down material that is now more logically based on The Carpentries website.

    You may notice that a lot of the links on The Carpentries transfer you directly to The Carpentries Handbook that we launched last week.

    The Handbook has been enthusiastically received by our community. For those who haven’t seen it yet, find it here. The aim of the Handbook is to provide a one-stop shop for people wanting all kinds of Carpentries-related information. Information is being added and updated all the time so please let us know if there is something missing. The Handbook and the website will complement each other to cover all things Carpentries.

    Please let us know if there are errors or omissions on our new website. You can raise an issue about the website at this link, or about the Handbook at this link.

    The launch of the new website completes our transition to a new, merged, online identity as The Carpentries. Increasingly we will blog as The Carpentries, rather than as Software or Data Carpentry, so be sure to check out our new blog.

    We also have our new merged Twitter feed. Follow The Carpentries on Twitter.



    The UF R Users Group was formed in January 2017, and since then we’ve been running a weekly “UF R Meetup”: a two-hour session consisting of a 30- to 60-minute presentation or tutorial followed by an “open lab” session. The meetup is meant to be a casual, informal opportunity to learn as a community and to seek face-to-face advice.

    By the end of the second semester of running the meetup, we had identified a couple of issues:

    • The majority of the participants were either beginners or completely new to R (and programming in general).
    • As our presentations shifted to cater to new users, it became difficult to engage and entice more advanced programmers.

    In addition, our presentations on the basics of R were unstructured and constructed on-the-fly – not the best way to teach and learn R. We felt that these disconnects were making it difficult to establish a sustainable learning community.

    In January 2018, we decided to run an introductory workshop series separate from the meetup. The workshop would provide structured lessons on the basics of R and allow the meetup to cover more advanced topics. Luckily for us, The Carpentries already have well-structured lessons for these materials, and we could rely on the strong pool of Carpentry instructors at the University of Florida.

    The question then became: “Do we want to run a traditional two-day Carpentry workshop, or try something different?”. We already knew that there was interest in regular weekly meetings, and saw potential in giving access to people who could not commit to a full two-day Carpentries workshop, or people who might need a refresher even though they’ve taken the two-day workshop. So we decided to run our workshop a bit differently than normal.


    We used the Data Carpentry in Ecology curriculum as a starting point. This included Data Analysis and Visualization in R, Data Organization in Spreadsheets, and Data Cleaning with OpenRefine. The two-day workshops usually include the Data Management in SQL lesson as well, but we felt it may have been too much for learners to learn all the SQL concepts in a two-hour session. Instead we opted to create some new material centered on the join features in dplyr, which rely on very similar concepts. This extended naturally from the dplyr lesson, and we even titled it “Advanced Dataframe Manipulation” to reflect that.

    | Week | Lesson |
    | :--- | :--- |
    | 1 | Intro to R |
    | 2 | Data Organization |
    | 3 | Starting with Data |
    | 4 | Manipulating Dataframes |
    | 5 | Visualizing Data |
    | 6 | Advanced Dataframe Manipulation |
    | 7 | OpenRefine |
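
    The join concept that SQL and dplyr share, and that motivated the “Advanced Dataframe Manipulation” session, can be sketched in a few lines. This is a plain-Python illustration of an inner join, not the workshop material (which used dplyr in R); the table contents and column names are made up for the example.

```python
def inner_join(left, right, key):
    """Combine rows from two tables (lists of dicts) that share the same key value."""
    # Index the right-hand table by key so each left row can find its matches.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        # Left rows with no match in the right table are dropped (inner join).
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical observation and lookup tables.
surveys = [
    {"record_id": 1, "species_id": "DM"},
    {"record_id": 2, "species_id": "PF"},
    {"record_id": 3, "species_id": "ZZ"},  # no matching species row
]
species = [
    {"species_id": "DM", "genus": "Dipodomys"},
    {"species_id": "PF", "genus": "Perognathus"},
]

result = inner_join(surveys, species, "species_id")
for row in result:
    print(row)
```

    Only the rows whose `species_id` appears in both tables survive, which is the behavior learners would meet again in an SQL `INNER JOIN`.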

    Other than that, it was run exactly like any other Carpentry workshop. We had different instructors for each lesson, there were helpers available, we created an Etherpad for collaborative note-taking, and used red and green sticky notes for real-time feedback. You can view the workshop homepage.

    How it went

    We’ve been a part of many Data Carpentry and Software Carpentry workshops here at the University of Florida, and this one went as well as any of them. Anonymous feedback at the end of each lesson was universally positive, and several participants told us in person how much they enjoyed it.

    Sticky note feedback

    We capped the elongated workshop at 40 participants and it filled fairly quickly. However, at most only 18 came to a particular session and attendance dropped over the 7 weeks.


    Several factors likely contributed to this attendance pattern. Attendance is often low the first time a new training opportunity is offered. We also chose not to collect a registration fee, because our group is designed to be an informal alternative to other resources on our campus, including formal courses and traditional Carpentry workshops (on average there are about three Carpentry workshops each semester at UF). The lack of a financial commitment from students may have contributed to the depressed attendance. We also found that there was less interest and reduced attendance in the non-R focused lessons, as well as more interest in the tidyverse-based lessons compared to base R lessons. Scheduling conflicts also arose over the course of the series, and once someone had to miss a lesson there appeared to be a lack of motivation to continue.

    Lastly, we were very interested in whether this schedule format improved access. Anecdotally, several participants told us that they preferred a two-hour-a-week workshop over a full two days. In a post-workshop survey, two of the three respondents said they preferred this schedule over a two-day workshop.

    Scheduling-wise, the majority of material fit into the two-hour time slots. The exception was the “Manipulating Dataframes” lesson where we did not have time for the very last section. Luckily this fit in nicely with the “Advanced Dataframe Manipulation” lesson to fill the full two hours in week 6.

    Lessons Learned

    Overall, we feel this elongated workshop was a success and we hope to run similar ones in the future. We are encouraged by the post-survey responses, as well as the anecdotal comments from the attendees. The workshop series also provided a less time- and material-intensive opportunity for newly trained instructors to gain some teaching experience.

    There are a few things we may change:

    • There were participants who felt confident not attending the first few sessions and only attended specific ones such as Data Manipulation or Visualization. This likely contributed to, at most, 18 of the 40 sign ups attending. In the beginning we encouraged people to attend every lesson but did not enforce this. In the future, we would consider session-specific sign-ups where participants can express interest in any or all of the sessions based on their needs.

    • We found it helpful to do a short recap at the beginning of each session to quickly summarize the primary lessons from the prior week.

    • We collected and re-distributed the post-it notes every week so as not to waste them, though some eventually lost their stickiness. In the end we used up roughly three-quarters of a single stack for each color.


    We are excited to announce the initial release of a Data Carpentry Social Sciences Curriculum. This is the first Data Carpentry curriculum targeted at researchers outside of the life sciences, and it provides an opportunity to reach out to new communities.

    Peter Smyth has assembled the initial content for these lessons with the guidance of Rachel Gibson, Professor of Political Science at the Cathie Marsh Institute of Social Research, University of Manchester, UK. It was polished during the April 2018 Bug BBQ, and the finishing was done by the lesson Maintainers in coordination with Carpentries staff.

    This curriculum aims to teach skills similar to those covered in the Ecology curriculum. It is focused on best practices for working with rectangular and tidy data. The curriculum covers data organization in spreadsheets, data cleaning with OpenRefine, as well as data manipulation and visualization with R. Lessons on SQL and Python are also available but are not part of this initial release.

    As with other materials for Data Carpentry, the same dataset is used across all the lessons. Here, we use a simplified version of a research dataset generated by the SAFI (Studying African Farmer-led Irrigation) research project. This dataset is available on Figshare and is survey data relating to households and agriculture in Tanzania and Mozambique. The survey data was collected through interviews conducted between November 2016 and June 2017 and covered such things as household features (e.g., construction materials used, number of household members), agricultural practices (e.g., water usage), assets (e.g., number and types of livestock) and details about the household members.

    If this curriculum were a piece of software, we would say it is in “beta”. The authors of this curriculum have taught it, and it is now ready to be taught by other members of The Carpentries community. We are interested in your feedback to improve it. We want to ensure it meets the needs and matches the skills that social scientists want to acquire when working with data. If you are a social scientist (or studying to become one), please review the lessons and provide us with your feedback. If you are interested in teaching one of the first Social Sciences Data Carpentry workshops, let us know by filling out this form.