Channel: Data Carpentry

My Favorite Tool: Rasterio

Rasterio is a Python spatial data library that has changed the way I work with large spatial datasets.

Ever struggled to do calculations with big datasets in proprietary GIS software? Re-project your results to analyze relationships with other data?

Rasterio makes manipulating gridded spatial data (rasters) simple and brings these data into the Python ecosystem.

Want to do some preliminary analysis on a low-memory machine?

Instead of reading a massive file, read it as windowed chunks.

Need to create quick derivative products like directional gradients?

Raster bands are read as numpy arrays so all your favorite numerical methods are available. Likewise, if you’re dealing with a set of time-referenced images, you can quickly load summary values into a Pandas dataframe for time series analysis.
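
Because a band is just a 2-D array, a directional gradient can be sketched with plain NumPy. The function name `directional_gradients` and the synthetic band below are assumptions for illustration; with a real raster the array would come from a band read, and the pixel size from the dataset's transform.

```python
import numpy as np


def directional_gradients(band, pixel_size=1.0):
    """Return (row-direction, column-direction) gradients of a 2-D band.

    np.gradient uses central differences in the interior and one-sided
    differences at the edges. Note that raster rows usually run north to
    south, so the sign of the row gradient depends on that convention.
    """
    d_row, d_col = np.gradient(band.astype("float64"), pixel_size)
    return d_row, d_col


# Synthetic band that rises by 2 units per column and is flat along rows.
rows, cols = np.mgrid[0:4, 0:5]
band = 2.0 * cols
d_row, d_col = directional_gradients(band)
```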

How the tool helps me in my work

Many of us in the Earth sciences deal with large, co-registered spatial datasets.

For example, changes in vegetation health at a volcano might be captured in a series of satellite images. These changes could be due to volcanic degassing or to a more benign environmental change. Direct information about volcanic activity, like gas emissions or earthquakes, is available in other grid formats or as point features. Meteorological data is in yet another raster format. Pre-processing data from multiple data sources can be time-consuming.

The use of Rasterio (and other libraries like scikit-image, fiona, shapely) has greatly streamlined my workflow for loading, transforming, resampling, and correlating these kinds of data to detect and analyze changes.

What I wish someone had told me when I first started learning this tool

I’d say check out the convenience functions show and show_hist in rasterio.plot. They make visualizing multi-band imagery easy.

And finally …

Lots of nice features are being added; it’s in pretty active development.

– Robert Sare / PhD student, earth sciences / Stanford, California, USA


Have you got a favourite tool you would like to tell us about? Please use this form to add a bit of detail and we will do the rest. You can read the background to these posts here, or see what other tools people have written about.


Lesson Infrastructure Subcommittee 2017 November meeting

On 16 November 2017 at 15:00 UTC+0, the Lesson Infrastructure Subcommittee had their 2017 November meeting. This post will cover the topics discussed and their resolutions.

Software Carpentry and Data Carpentry merger

With the merger in 2018, some Git repositories will be owned by a new GitHub organization. The Instructor Training course material has already been moved; you can now find it at http://carpentries.github.io/instructor-training/. The date for the migration will be announced in 2018. Instructions for migrating the repository can be found here.

Syntax Highlight

Thanks to naught101, the next release of our lesson will offer syntax highlighting to our readers. Lesson maintainers might need help to change

~~~
print("Hello World")
~~~
{: .foobar}

to

~~~
print("Hello World")
~~~
{: .language-foobar}

for example. If you want to help, send a pull request to us.
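
For maintainers who would rather script the change, here is a rough sketch of one way to do it. It is not the official migration tool: the function name `add_language_prefix` is hypothetical, and the set of classes to leave alone (`output`, `error`) is an assumption about which classes mark non-code blocks in a lesson. Only attribute lines that directly follow a closing `~~~` fence are rewritten, so attribute lines elsewhere are not touched.

```python
import re

# Attribute line such as "{: .python}", optionally inside a "> " blockquote.
ATTR = re.compile(r"^(?P<prefix>(?:>\s?)*)\{:\s*\.(?P<cls>[\w+-]+)\s*\}\s*$")
# Closing code fence, optionally inside a blockquote.
FENCE = re.compile(r"^(?:>\s?)*~~~\s*$")
# Assumption: these classes are not languages and should stay as they are.
SKIP = {"output", "error"}


def add_language_prefix(text):
    """Rewrite '{: .foo}' after a '~~~' fence to '{: .language-foo}'."""
    lines = text.split("\n")
    for i in range(1, len(lines)):
        match = ATTR.match(lines[i])
        if not match or not FENCE.match(lines[i - 1]):
            continue
        cls = match.group("cls")
        if cls in SKIP or cls.startswith("language-"):
            continue
        lines[i] = f"{match.group('prefix')}{{: .language-{cls}}}"
    return "\n".join(lines)


sample = '~~~\nprint("Hello World")\n~~~\n{: .foobar}\n'
converted = add_language_prefix(sample)
```

Running the function twice gives the same result as running it once, so it should be safe to re-run on a partly migrated lesson. Review the diff before sending the pull request.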

Exercises Throughout the Episodes

After a short discussion, we reached the consensus that it will be better to have exercises throughout the episodes instead of all the exercises at the end of the episode. Lessons will migrate to the new format at a slow pace because this change requires a good amount of work.

Non-English Lessons

If you have been involved with us since 2014, you might remember this post about the attempt to translate the lesson to Spanish and this other post announcing the lessons in Korean. During the meeting, we had a conversation about the workflow to translate lessons to other languages, and there is now interest and work on a translation.

Some of the conversation was archived as issues here. If you want to get involved with the translation, join the Latinoamerica email list or see the updates.

Windows Installer

In March 2017, a discussion about our recommended text editor created a lot of buzz on the mailing list. The email thread started because sometimes nano wasn’t installed on the learners’ machines. The new version of Git Bash will include nano by default and we have a pull request, thanks to Oliver Stueker, to adopt the new version in our workshop instructions. The pull request will be merged at the end of this year or beginning of 2018.

Next steps

Version 9.3.x of the lesson template and lesson documentation has been released. Maintainers are working to release the new version of the lessons before the end of the year.

The subcommittee will meet again in February to provide an update on some of the topics covered by this post and discuss new requests from the community.

Acknowledgement

Thanks to Alejandra Gonzalez Beltran, Christina Koch, David Pérez-Suárez, Erin Becker, Naupaka Zimmerman and Tracy Teal.

Open Con Berlin - Impressions

Along with fellow Software Carpenters Rayna Harris and Paula Martinez, I attended OpenCon 2017 held over the weekend of 11-13 November, 2017 in Berlin. The conference was held in the Harnack Haus in Dahlem, the home of the Max Planck Society, where the friendly ghosts of Einstein, Heisenberg and other stellar scientists smiled on our endeavours to promote open access, open education and open data.

Harnack Haus, Dahlem

This was a conference with a difference. Most conference goers were very new to this area of work so there was a strong learning aspect to all that unfolded over the three days. Many of the speakers had eye-opening stories to tell about education’s role in transforming lives, whether it be Kholoud Ajarma’s experiences of growing up in a Palestinian refugee camp or Aliya Ibraimova’s work with remote grazing communities in the Kyrgyz mountains.

Rayna's tweet

While the largest cohort (50) were from the US, 47 different countries were represented at OpenCon. Of the 186 listed in the attendance sheet, 132 attendees had GitHub accounts and even more used Twitter (160).

Sessions were a mixture of plenary sessions and small group work. As an early icebreaker, we were put into groups called Story Circles, in which everyone had eight (uninterrupted) minutes to explain what had led them to apply for and attend OpenCon. The sheer diversity of backgrounds and experiences unearthed by this kind of session was astounding. Hearing Thomas Mboa describe teaching Nigerian students without having access to electricity certainly put some of my own workshop issues into perspective.

My story circle

Another eye-opener was the Diversity and Inclusion panel where uncomfortable questions about ‘whose knowledge?’, ‘who has access?’, and ‘who is missing from the discussion?’ put paid to the idea that ‘open’ is a universal, unquestionable good. Speakers from the global south stressed that making knowledge open can seem like a replay of having that knowledge stolen from them during the colonial period. And if ‘open’ does not welcome people of all genders, sexual orientations, colors and other forms of diversity, then how ‘open’ is it really?

The quality and clarity of OpenCon recordings mean that these sessions can easily be watched by anyone with an interest in what was said. Footage of the Diversity and Inclusion panel also includes the post-panel discussion.

To help build more local action post-conference, people could opt to work with groups from their own region. Since I was the only Australian there, I chose to work with an Asian group, and helped people from Armenia and Taiwan create ‘empathy maps’ to try to understand the concerns of researchers in their region who might want to work ‘open’ but who face formidable barriers, not least the kinds of behaviours outlined by Laurent Gatto’s and Corina Logan’s ‘Bullied by Bad Science’ campaign.

The final day of OpenCon was a Do-a-Thon - what I would call a sprint or hackathon. For this day, Rayna and Paula marshalled a team from Chile, Argentina and other Spanish-speaking countries to work on the Spanish translations of Carpentry lessons.

Spanish translation Do-A-Thon

This was certainly a one-of-a-kind conference and for those who missed it, session recordings are available online, courtesy of the Right to Research Coalition. The conference was phenomenally well-organised, with terrific food, and people could opt to join Dine-Arounds to ensure that no one had to eat dinner all alone in a strange city. I was very interested in the organization of the conference as I was hoping to get many tips I could use to make next year’s CarpentryCon in Dublin a similar success.

The conference’s leading sponsor was the Max Planck Gesellschaft (Max Planck Society), and the conference was jointly organised by SPARC (the Scholarly Publishing and Academic Resources Coalition) and the Right to Research Coalition. A number of other organisations and foundations were supporting sponsors.

A floor tile at Harnack Haus was inset with Einstein’s signature - you don’t see that every day.

Albert Einstein signature

Call for Contributions: Geospatial and Social Sciences Lessons

The Data Carpentry community has published two full workshop curricula in 2017, both targeted towards researchers in the life sciences. We had our first lesson release for our Ecology lessons in May, followed by the release of the Genomics workshop materials in November.

Data Carpentry lessons are domain-specific, and targeted towards helping researchers in particular domains gain the skills they need to conduct their research efficiently and reproducibly. We’re excited to broaden our reach to researchers outside of the life sciences starting in mid-2018 with the release of curricula for working with Geospatial and Social Sciences data.

Carpentry lessons are developed by and for the community. Lend a hand in developing these materials and preparing them for publication and teaching. There are many ways to get involved, ranging from helping edit the existing lesson drafts, to running pilot workshops, to serving as a Maintainer for the completed lessons. We are also setting up Curriculum Advisory Committees for the two sets of lessons. Members of these Committees will help ensure these lessons stay up-to-date and continue to serve the needs of our learners.

How you can contribute

We are asking anyone interested in helping now (or in the future) to fill out a brief form before December 20th so that we can organize the effort:

Contribution Form for Geospatial Lessons

Contribution Form for Social Sciences Lessons

If you don’t get a chance to fill out the form by December 20th, but still want to be involved, please get in touch with Erin Becker (ebecker@carpentries.org).

While experience in geospatial or social sciences research and experience with the Carpentry community are a plus, there are many ways to contribute even if you don’t have this background. Please circulate this link and post to others who might be interested. We will be following up near the end of January 2018 to organize everyone and provide more information.

Thanks to everyone who is working to move these lessons to the next stage!

Challenges Assessing Data Science

The Assessment Network was established as a space for those working on assessment within the open source/research computing community to collaborate and share resources. During our quarterly meeting in November, we engaged one another in a conversation revolving around data science education. This meeting was organized and hosted online by Kari Jordan, and six community members attended.

First, we discussed the definitions of data scientist, data analyst, and data engineer; second, we worked in pairs on a set of questions about assessing data science education.

The session was exciting and fruitful, as it combined two topical efforts: on one hand, our organization’s focus on assessment and, on the other hand, our contribution to the global effort in defining, understanding, and shaping the rising field of data science.

Kari Jordan attended a meeting of collaborators from industry, academia, and the non-profit sector to brainstorm the challenges and vision for keeping data science broad. During that meeting, a brainstorming session took place where attendees were asked to come up with core competencies for data science. This was difficult, as each sector identified competencies important for their particular interest. Kari thought it would be a good idea to talk about it with the assessment network.

What is Data Science?

So, what is data science? What are the core competencies? For a positive definition, we turn to the seminal “Data Science Venn Diagram” by Drew Conway, as reproduced by Jake VanderPlas in the preface of his Python Data Science Handbook. Data science lies at the intersection of statistics, computer science, and domain expertise (in industry-friendly terms, or traditional research, in academic terms). Data science is cross-disciplinary by definition. Hardly anyone gets formal training in all three areas. Most working data scientists are self-taught to a certain extent. Basically, it takes a growth mindset to be a data scientist!

For a negative definition (in logician’s terms, i.e., what data science is not), we turn to industry job descriptions. It turns out that Marianne Corvellec served on a panel dedicated to the definition of these emerging occupations. This panel was held in 2016 with Québec’s Sectoral Committee for the ICT Workforce. It brought together industry professionals and HR specialists who would frame the discussion, and resulted in this report (in French; note that “architecte de(s) données” == data engineer and “scientifique de(s) données” == data scientist).

This report is in line with academic sources (e.g., data science curricula at U.S. universities), insofar as a data scientist is not a data engineer. A data engineer takes care of data storage and warehousing; s/he builds, tests, and maintains a data pipeline, which integrates diverse data, transforms, cleans, and structures them. S/he masters big data technologies, such as Apache Hadoop, Apache Spark, and Amazon S3. Data engineers ensure the data are available (and in good shape) for data scientists to work with.

What is a Data Scientist?

More subtly, a data scientist is more than a data analyst. It takes an aptitude for collecting, organizing, and curating data, as well as for thinking analytically. A strong quantitative background is useful but not necessary. Principles and practices from the social sciences or digital humanities are valuable assets; data scientists should be good writers, good storytellers, and good communicators. Perhaps surprisingly, attention to detail is not a key item to include in a data scientist’s skillset; the ability to grasp the big picture matters much more, as data scientists will find themselves working at the interface of very different departments or fields (in an industry context, these could be engineering, marketing, or business intelligence).

A data scientist does not master any specific technology to perfection, since s/he dabbles in everything! Unlike the traditional data (or business intelligence) analyst, s/he resorts to several different frameworks and programming languages (as opposed to a given domain-specific platform) in order to leverage data. Plus, the data scientist typically works with datasets coming from multiple sources (as opposed to the traditional data analyst who usually works with a single data source already populated by an ETL solution). Data scientists are flexible with their tools and approaches.

Challenges Assessing Data Science Education

In the second part of the meeting, we split into breakout pairs to discuss the challenges of assessing data science education with respect to Carpentries’ workshops. Brainstorming in parallel lets us cover more ground (breadth), while interacting one-on-one lets us explore different avenues (depth).

One pair focused on the industry perspective, another on the education system, and the third on assessment practices. Kari offered a list of questions to frame the discussion.

Working groups identified challenges for assessing data science education at the object level (i.e., what should this assessment consist of?) and at the meta level (i.e., what favors or hinders the application of assessment?).

At the meta level, the following prompts were discussed (pulled from South Big Data Hub’s Data Divide workshop):

  • Vision for Assessing Data Science Education
  • Stakeholders for Data Science Education
  • What specific skills or resources are most important/lacking to address this challenge?
  • How do our challenges fit into the national landscape?
  • What is the broader impact of addressing our challenges?

Check out the notes from our working groups to see what we came up with!

Now is your chance to tell us what you think. We opened several issues on the Carpentries assessment repo. We’d love to engage you in a rich discussion around this topic. Comment on an issue, and tweet us your thoughts using the hashtag #carpentriesassessment.

When Do Workshops Work?

Author: Karen R. Word

Contributors: Kari Jordan, Erin Becker, Jason Williams, Pamela Reynolds, Amy Hodge, Maxim Belkin, Ben Marwick, and Tracy Teal.

“Null effects of boot camps and short-format training for PhD students in life sciences” is the provocative title of a recent article in the Proceedings of the National Academy of Sciences. Those of us who enthusiastically design and deliver short-format training promptly took note, then scratched our heads a bit. We waited a little for a response, wondering if one or more of the programs that participated in the study might step up to their own defense. Nothing happened. We thought about letting it go - we’ve got our own programs, with distinct goals, and our own assessment data, so maybe this broad-brush study isn’t so important. But … it keeps being raised. Someone will bring it up here and there, asking what we think about it. Whenever this paper comes up in conversation, its title certainly throws some weight around.

So, do workshops work? However certain we may be about the value of our own programs, it seems important to have a little sit-down with this paper and talk about what it means to us, what it doesn’t mean and, most importantly, what it does not address at all: the question of what you can do with a short course [1] when a short course is all you’ve got.

The premise: Spacing instruction over time is better for learning

When given a choice between teaching a two-day short course versus stretching those same hours and content across several weeks of repeated meetings, you can expect to get a lot more learning out of the longer course. This point, described as a core premise for the PNAS study, is essentially irreproachable. There is abundant evidence that distributing instruction over time maximizes learning in comparison with the “massed practice” that occurs when teaching is concentrated into an intensive short-format course.

The problem: Spacing instruction over time is often impractical

Traditional courses match students and faculty on a spaced schedule over a quarter or semester time period. When this format is possible, it should be pursued and optimized, not replaced with short courses.

But when isn’t it possible?

When there aren’t enough instructors. If expertise in an area is scarce, the time demand for distributed training often exceeds the FTEs available to meet that need. Until that shortage can be remedied, a large number of people are left to self-teach or go without. Under these circumstances, short-format workshops are often the only practical way to deliver training to the many more who need it. This is currently the situation with regard to training in data management and analysis, and in many cases, with foundational computing skills as well.

When learners don’t have time. A similar scenario emerges when those in need of training are fully committed to jobs or research or are otherwise unavailable for a time-distributed course. This is the case for most professional-development training. Even within academia, researchers may need training right away and can’t wait for the next semester-long course offering.

When opportunity knocks. Even within graduate school, where long-format courses are the norm, some opportunities are concentrated in time. For example, a short course may be able to attract many faculty simultaneously, allowing students to observe them engaging with and learning from each other. Some research experiences or team-building activities may also be possible only on a concentrated schedule. Also, where traditional course curricula can be slow to change, short courses can permit rapid inclusion of new and needed skills before they can be added elsewhere.

When a little goes a long way. In many of these cases, particularly when training is truly necessary for progress, learners are already engaged in self-teaching, and conveying a large quantity of knowledge may not be as important as providing a boost of confidence and a guide to best-practices as they proceed. Embracing the limitations on learning and leveraging the flexibility and low-stakes of a workshop setting might actually confer an advantage in these areas.

For those of us who work within the short course mandate, then, the question becomes: how can we optimize that format to best meet learners’ needs? When setting goals for impact, we tend to think in terms of how much and what type of impact we can have, and to focus our efforts accordingly.

One reason the paper by Feldon et al. raises concern within our community is that it frames the question as “whether”. And if the answer to “whether” we can have an impact with a short course is “no”, then we’ve clearly got a problem on our hands. However, in our experience, that simply is not the case. To the contrary, our evidence suggests that there is quite a lot you can accomplish with a workshop when you accept its constraints, focus on specific goals, and leverage the strengths of this format. In the next section, we’ll take a look at the study described in the paper, evaluate its claims, and examine its relevance to the kind of training we provide. Then we’ll circle back around to our goals, our strategies, and the kind of data that we collect to assess and inform the development of our workshops.

The study

There is a lot to love in this work! This was not a simple survey study. They graded papers – multiple times, with validation, for 294 students from 53 institutions. They also repeatedly administered tests and surveys over the course of two years. The dataset must be impressive; we assume there is a LOT of other interesting stuff there that relates to graduate student development and correlates of early success. However, it is hard to know since the data are not publicly available or displayed in the paper. We’re eager to see more publications and perhaps more extensively summarized data come out of this project in the future.

That being said, in discussion with our community members, several persistent questions and concerns emerged. These are a few of the most pertinent questions:

1. How diverse are the program goals? This study lumps together an unknown number of programs administered at the outset of life-science PhD programs as a single treatment. We know only that 53 institutions were sampled and that, of the 294 students in the study, 48 were short-course “participants”. According to Feldon et al., the unifying goal of these programs is to “accelerate the development of doctoral students’ research skills and acculturation”, with emphasis on research design, statistics, writing, and socialization. However, specific emphasis seems likely to vary, and herein lies the concern most frequently voiced in our community: any given program might focus its efforts on any or all of the components identified (research, statistics, writing, or socialization). Indeed, the more astutely a program identifies and engages with short-format limitations, the more focused it may be. With students surveyed across 53 different institutions, it seems highly likely that the specific aims of different programs are heading in different directions. If some programs are particularly good at socializing students and preparing them to cope with the hurdles ahead, while others emphasize grant writing, otherwise ‘significant’ impacts within a sub-group of similar programs are likely to be lost when combined and assessed with the group overall. This is particularly clear if we consider the sample size of 48 students as being further split (e.g. 10, 10, 15, 13) by distinct program emphases. Lumping together successful programs with different aims is likely to show that all are ineffective in each category.
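
To make the dilution argument concrete, here is a back-of-the-envelope sketch. Everything numeric beyond the 294 and 48 reported above is an assumption: the effect size of 0.8 standard deviations is hypothetical, the 10/10/15/13 split is the illustrative one from the text, and we assume the remaining 246 students act as the comparison group. The z-score uses a simple two-sample approximation with unit variance, so treat the output as an illustration and not as a reanalysis of the paper.

```python
import math

PARTICIPANTS = 48
COMPARISON = 294 - PARTICIPANTS  # assumption: everyone else is the comparison group
SUBGROUPS = (10, 10, 15, 13)     # illustrative split by program emphasis
TRUE_EFFECT = 0.8                # hypothetical gain (in SD units) on the targeted skill


def pooled_effect(n_sub, n_total, effect):
    """Average effect on one skill when only n_sub of n_total students were trained in it."""
    return effect * n_sub / n_total


def z_score(effect, n_treated, n_control):
    """Approximate z for a difference in means, assuming unit variance in both groups."""
    standard_error = math.sqrt(1 / n_treated + 1 / n_control)
    return effect / standard_error


results = []
for n_sub in SUBGROUPS:
    diluted = pooled_effect(n_sub, PARTICIPANTS, TRUE_EFFECT)
    results.append(
        {
            "subgroup_size": n_sub,
            "z_subgroup_alone": z_score(TRUE_EFFECT, n_sub, COMPARISON),
            "z_pooled": z_score(diluted, PARTICIPANTS, COMPARISON),
        }
    )

for row in results:
    print(row)
```

Under these assumptions, each sub-group analyzed on its own clears the conventional 1.96 threshold, while the same effect averaged over all 48 participants does not.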

2. How generalizable is this context? The public reading of these findings seems to be, “Too bad short courses don’t work”. However, pre-PhD programs are a highly specific and unusual context for a short course. In most other cases, short courses arise out of necessity or unique opportunity, such that there is no subsequent distributed content that re-teaches or even remotely overlaps with the content taught in the short course. In pre-PhD programs, specifically, any effects are potentially in direct competition with gains made via traditional course content. The extent to which the same or overlapping content is otherwise available in each program is also unclear. The authors of this paper might not have intended their work to generalize to other contexts, but the tendency of readers to generalize makes this question a vital one. Benefits of a short course are easily lost in a sea of positive outcomes resulting from graduate training, but that has little bearing on the impact such courses may have when they stand alone.

3. Is this the right experiment to test graduate student outcomes? While we found the methods to be impressive and worthwhile in many respects, several people expressed concern about the two-year assessment regime. This included questions as to whether a graduate student is likely to have matured and, particularly, to have written substantively in their content area within the first two years of study, as well as whether a regime of continuous surveys might itself have a sizeable impact on student development. As with any study that takes volunteers, willingness to participate – both in the short course programs and in the study itself – may bias toward more motivated or engaged students overall, and this could have an impact on the interpretation of the results. These are the sorts of problems that plague any effort at assessing students at scale, and are worth noting only as a standard “grain of salt” with which any study should be (but is not always) considered when it stands alone.

4. How do we go about making short courses more successful? This paper provides no means of evaluating variation between programs, which is really where our interests lie. This is not a criticism: it is simply not the purpose of the paper. It is the next question, the natural response to such results: if these programs really aren’t making a difference, how might we capture the opportunity, with existing funded and institutionally invested programs, to change that? Is it that short course workshops have no impact on anything, or that we need to better understand and plan for what they can accomplish?

We have a few suggestions.

What We Do

Software and Data Carpentry offer short-course training for academics and professional researchers in software and data management skills. Many of our affiliates, who have also contributed to this response, offer other short courses in related subjects. We are all driven to the short-course format out of necessity. We recognize that this format places severe constraints on the quantity of information that can successfully be conveyed, but we design our curriculum and train our instructors specifically to maximize our effectiveness in this format. Here’s how we do it:

Streamline content. We aim to teach only the most immediately useful skills that can be taught and learned quickly. We teach our instructors to resist the urge to “get through everything” or pack extra details into their explanations.

Teach strategically. We keep learners active by using live coding (in which learners work through lessons along with the instructor) and frequent formative assessment. We teach instructors to be mindful of the limitations of short-term memory and to focus instruction and assessments to minimize cognitive load.

Meet learners where they are. Our workshops attract a diverse population of learners, from novices to experienced IT personnel. Our learners use colored sticky notes to indicate when they are stuck. We teach instructors how to use this to adjust their pacing. We also recruit workshop “helpers” who can directly coach learners who may be struggling. The absence of performance-based grades gives us added flexibility to meet diverse needs by generating diverse learning outcomes. Some may learn about the “big picture” of a new programming language by completing a lesson, while others may come away having added “tips and tricks” to their existing skills. This is one area in which workshops may have an advantage over traditional courses, particularly when it comes to confidence- and motivation-based outcomes.

Normalize error and demonstrate recovery. We know and expect that our learners will acquire the bulk of their skill independently. Willingness to make mistakes and awareness of problem-solving strategies are far more crucial to their success than any particular content. We coach our instructors to embrace and even delight in their own errors as an opportunity to model healthy and effective responses.

Explicitly address motivation and self-efficacy. One substantial advantage that we have is that our learners attend our workshops because they are motivated to learn precisely what we teach. However, preserving and nurturing that motivation is crucial. Perseverance results not only from embracing error as normal, but also from learners’ personal belief in their ability to succeed. Creating a workshop in which learners can be successful both in learning and in demonstrating to themselves that they have learned is one piece of this. We spend a good deal of time discussing motivation with our instructors. We explain why saying “it’s easy, anyone can do it” is often demotivating. We explore the differences between novice and expert perspectives and coach instructors to be mindful of and to respect the novice experience. We teach instructors to foster a growth mindset in their language and learner interactions. We emphasize that a relaxed, welcoming, and positive workshop experience is one of the most important things we can provide.

Build community. The more people at all levels are able to share what they know, the more efficiently we can distribute knowledge. As a volunteer organization, we have a strong community of instructors, lesson maintainers, and others. As learners progress, they often become involved in this community. In the long range, we hope to create a community that can provide widespread support directly to learners.

What we know about our impact

We have conducted both short-term and long-term follow-up assessments of learners. Data Carpentry post-workshop survey results have always been positive and 85% of learners agree that they would recommend our workshops to a colleague. The Carpentries’ Long-Term Impact survey (n = 530) is designed to determine whether this positive experience and self-reported increase in confidence affects long-term outcomes. This survey (full report here) measured self-reported behaviors around good data management practices, change in confidence in open source tools, and other specific program goals. It also explored other ways the workshop may have impacted learners, such as improved research productivity. While Feldon et al. rightly critique self-assessment with regard to performance metrics, many of our target outcomes are more conducive to self-evaluation, e.g. confidence, motivation, and daily work habits. Researchers report increased daily programming usage after attending our two-day coding workshops, and sixty-five percent of respondents report higher confidence in working with data and open source tools as a result of completing the workshop. Our long-term assessment data shows a decline in the percentage of respondents that ‘have not been using these tools’ (-11.1%), and an increase in the percentage of those who now use the tools on a daily basis (14.5%). Additional highlights from our long-term survey report include:

  • 77% of respondents reported being more confident in the tools that were covered during their workshop compared to before the workshop.
  • 54% of respondents have made their analyses more reproducible as a result of completing a workshop.
  • 65% of respondents have gained confidence in working with data as a result of completing a workshop.
  • 74% of respondents have recommended our workshops to a friend or colleague.

We see that short-format workshops can be effective at increasing researchers’ confidence, use of coding skills, and adoption of reproducible research perspectives. As a part of the Open Source community, we make all of our survey data and analysis code available in our assessment repository. We welcome people to work with our survey data and ask new questions. Understanding impact is important, and we will continue to keep our community informed with regular releases of survey data and reports. We also have a virtual assessment network which newcomers are welcome to be part of. Please join here if you are interested in discussing assessment efforts in the area of training in research computing.

In Closing …

Our data suggest that we are having a positive impact, and we expect that other short-format programs can be similarly effective. However, this likely requires a focused effort on optimizing within the limitations of a short course, along with clear goals and targeted assessment to demonstrate such efficacy. It is not clear that this was the case for any of the programs surveyed by Feldon et al., and if it was, it is not clear to us that any such specific and variable successes would be discernible in their study. We agree, however, that under most circumstances, particularly where a large quantity of content needs to be taught, a short-format course should not be favored over any available time-distributed alternative.

We applaud, encourage, and endeavor to support those who have the access and opportunity to conduct long-format training in the subjects we teach. Many members of our community are actively involved in traditional undergraduate and graduate instruction of this kind. Traditional training opportunities will begin to catch up with demand for training in data science generally, but there will always be limitations - concepts or tools that don’t clearly fit into curriculum or new approaches that haven’t yet had a chance to be incorporated. We work on training in these gaps through short courses. It is necessary for us to be as effective as possible to achieve that mission.

So far, we feel comfortable declaring that effort a success.


[1] While the paper refers to programs as either “boot camps”, “bridge programs”, or “short-format training”, it has been brought to our attention that this usage of “boot camp” can cause some consternation for those with military training or under military regimes. We will therefore use the less-vivid but more-accurate “short course” label for this piece.

Announcing the 2018 Executive Council for the Carpentries

Voting in the election for community governance of the Carpentries (Executive Council, formerly named Steering Committee or Board of Directors) closed last week. Out of the 501 members eligible for voting, 147 ballots were cast (29% turnout).

We are pleased to announce the four newly elected members of the Executive Council:

Raniere and Lex received the highest number of votes and will serve two year terms; Amy and Elizabeth will serve one year terms.

These four elected members will join the five appointed Council members selected from the current leadership of Software Carpentry and Data Carpentry:

  • Karen Cranston is a computational biologist at Agriculture and Agri-Food Canada working on digitisation and integration of biodiversity data. She was the lead PI of the Open Tree of Life phylogeny synthesis project, and serves on the board of the Open Bioinformatics Foundation (OBF). She has been involved with Software Carpentry since 2012, was a founding board member of Data Carpentry, and is a certified instructor trainer.

  • Kate Hertweck is an Assistant Professor at the University of Texas at Tyler. Her research and teaching focus on bioinformatics and genomics. She completed Instructor Training in fall 2014, served on the Mentoring Subcommittee in 2015, and was elected to the Software Carpentry Steering Committee in 2016 and 2017, also serving as Chair in 2017.

  • Mateusz Kuzak is Scientific Community Manager at the Dutch Tech Center for Life Sciences. He has a background in bioinformatics, live cell imaging, and research software engineering, and is passionate about Open Source, Open Science and Reproducible Research. He is currently working on training activities and coordinating life science data and technology projects in the Netherlands. Mateusz is an Instructor Trainer and was elected to the 2017 Software Carpentry Steering Committee.

  • Sue McClatchy is a bioinformatician and research program manager at the Jackson Laboratory. She provides research training at all academic levels from high school to faculty. She mentors students and develops training materials for analysis of quantitative and high-throughput data. Her expertise in curriculum design and instruction stems from an eight-year science teaching career in schools in the U.S. and Latin America. Sue is an Instructor Trainer and was elected to the 2017 Software Carpentry Steering Committee.

  • Ethan White is an Associate Professor at the University of Florida working on computational and data-intensive ecology. He is a Moore Foundation Investigator in Data Driven Discovery and serves on the board of directors of Impactstory. He has been involved in Software Carpentry since 2009, was a founding member of the Data Carpentry steering committee, wrote the first version of the Data Carpentry Ecology SQL material, and leads the development of the semester long Data Carpentry course for biologists.

Many thanks to all candidates who chose to stand for election. The voting was very close, which reflects the commitment you all show towards service to our community. We are fortunate to have such awesome leaders representing diverse education, careers, and geography. We look forward to continuing to work with you in the Carpentries community, and hope you will consider pursuing other opportunities for leadership.

Also thanks to the outgoing steering committee members:

  • Software Carpentry: Rayna Harris, Christina Koch, Karin Lagesen
  • Data Carpentry: Hilmar Lapp, Aleksandra Pawlik, Karthik Ram

Finally, thanks to all of you across the Carpentries for your continued participation and engagement!

Our 2017 Community Service Award winner: Anelda van der Walt

The Carpentries are happy to honor Anelda van der Walt as our 2017 Community Service Award winner. We received seven independent nominations for Anelda this year, which is a testament to her commitment to both individual people and the broader community.

Starting from scratch, Anelda planted the tiny seed that has now become the phenomenal growth of Software and Data Carpentry in South Africa, not to mention its spread to an ever-growing list of other African countries, such as Namibia, Botswana, Ghana, Gabon, Mauritius, and Ethiopia.

With great determination and persistence, she secured funding to enable a range of workshops and Instructor trainings to be run, such as this first workshop in 2016 and this one in 2017.

Funding meant many participants could travel to and attend training which would normally have been far beyond their reach. She also secured the first ever Software and Data Carpentry membership in South Africa. Through her passion for the Carpentries, she has inspired many people to acquire the command line, HPC and other skills that many thought were beyond their capacity to learn.

![Anelda’s award certificate](/files/2017/12/avdw_award.jpg)

Since then, she has successfully grown a pool of qualified instructors and has helped hundreds of researchers in South Africa and other African countries develop foundational computational and data skills to drive their research forward. Instructor numbers are now above 22.

Community and capacity building on this scale are much more challenging in southern Africa. Differing research sector priorities, cultural issues, and the availability (or otherwise) of reliable networked infrastructure mean that funding alone is not the only challenge workshop organizers face. Given this, it is commendable that Anelda has worked so hard to foster and support diversity, reaching out to researchers in rural areas and actively working to include groups hitherto under-represented in STEM.

In addition to capacity building, she has taught at more than 10 workshops and has both organized and taught at three instructor training events within the past 18 months. Post-training, she has followed up with trainees to encourage them to complete their check-out, and has helped many begin planning and running their own workshops, oftentimes helping them source extra instructors and helpers.

She encourages Instructors across Africa to interact with each other via African-centred calls like this, both to foster collaboration and to ensure new Instructors feel valued and welcomed into the community.

She also contributes to the global Carpentries community by participating in regular Trainer discussions and meetings and by taking her turn at hosting instructor discussion sessions and teaching demos.

Congratulations Anelda and thank you very much for everything you have done – we honor and value the work you do for the Carpentries.


A Week o' Carpentry

Bringing the Carpentries to the Federal Reserve Board staff

I ran Technical Training for the Federal Reserve Board for over 20 years, designing and implementing training programs tailored to the Fed’s specific needs to ensure staff had the computer skills necessary to do their jobs.

When I started bringing in the Carpentry workshops, research assistants (RAs), who come to the Board on two-year appointments to work with economists, were the intended audience, but permanent staff, notably technology analysts (TAs) and others in the research divisions who support the RAs and economists, also signed up for these workshops. The Board, and the Federal Reserve System as a whole, is definitely dealing with big data, and employs economists, technology analysts, and research assistants to turn collected and procured data into knowledge.

The Carpentry workshops worked well, but attendees mentioned they would like this training even more if it were tailored to the Board and economics research. I decided the best way to make this happen was to grow our own instructors. To do this, we first had to have enough permanent staff take one or both Carpentry workshops to build a pool of potential instructors, so over 2015 and 2016, we held a total of seven workshops, still primarily aimed at RAs, but with TAs and other interested staff encouraged to attend, too.

That done, in late 2016, Tracy Teal, Greg Wilson, Jonah Duckles and I had a conference call about instructor training and the event I had in mind. I also polled colleagues at the twelve Federal Reserve Banks, some of whom had been holding Carpentry workshops, too, to learn whether they might be interested in sending people to Washington DC to train for becoming Carpentry instructors.

Out of this came a week-long event – what I called ‘Week o’ Carpentry’ in my head – involving 18 people for the Instructor Training, nine from the Board and nine in total from four different FR Banks, six instructors, and a small number of new RAs and other permanent Board staff. Oh, and lots of food, too, including 21 pounds of tangerines.

Snacks for the workshop

Week o’ Carpentry

Our week looked like this:

Monday-Tuesday: Concurrent sessions of Software Carpentry and Data Carpentry

Wednesday: Discussion, exercises, and getting out of the classroom (suggested by Greg Wilson, discussed in the aforementioned conference call, and further developed in consultation with others, particularly Varda Faghir Hagh)

Thursday-Friday: Instructor training

Before the workshops

Although SWC/DC does not require it, I required participants for the Instructor Training to have taken a Carpentry workshop, even though all of them were competent or expert with most, if not all, of the technologies being taught. Concurrent Data and Software Carpentry workshops on Monday and Tuesday made it possible for those who hadn’t yet attended a workshop to take one, gave others who had taken them the opportunity to serve as helpers and watch the workshop from the standpoint of becoming an instructor (‘fishbowling’), allowed those who’d taken only one workshop to take the other if they wanted, and ensured everyone attending the Instructor Training had exposure to the materials. Though most new RAs arrive during the summer, a few early arrivals attended these May 1-2 workshops, too.

One desire for this training was to build connections among the attendees; this extended even to the SWC/DC instructors. The four instructors, Jeffrey C. Oliver, Varda Faghir Hagh, Lachlan Deer, Easton White, and I met for dinner on Sunday night. It was great to break bread together, and was a good chance to ask and answer questions, talk about the coming week, the Federal Reserve, the reason for these particular workshops (tied as they were to the Instructor Training), and the participants’ overall knowledge level. Many of the Instructor Training participants already had a lot of technical knowledge; those new RAs attending SWC and DC workshops did not, however, and since they needed to master the content, the workshops had to be geared to and taught at their level.

Data and Software Carpentry Workshops

The workshops on Monday and Tuesday were held in adjoining training rooms; the breaks, which occurred at approximately the same time in each workshop, were held in common space outside the classrooms. Having the food outside the classrooms encouraged people to leave the rooms, and by placing no seats in the common space, we encouraged people to move around. It also gave the instructors for the two workshops an opportunity to talk to one another and to participants in whichever workshop they were not teaching.

At the end of the workshop, we said good-bye to two of our excellent instructors, Jeff Oliver and Easton White; the other two instructors, Varda Faghir Hagh and Lachlan Deer, returned for most of the next day’s activities.

Bridge Day

On Wednesday, the wall between the two classrooms was removed, and the 18 people taking instructor training all worked in the same room together for the first time for a very imaginatively titled session called From Software Carpentry/Data Carpentry Participant to Instructor. (Okay, so not so imaginatively titled …)

I wanted to try to build some cohesion among the participants, so to me, it was important that everyone actually learn everyone else’s name. Since it’s so easy not to do this, we did a call and response introduction both during the bridge day and in Instructor Training. For example, I stood up and said, ‘My name is …’ and the expected response from everyone else was ‘Alice’. This continued around the room; each time we did this during the latter part of the week, the responses got louder as more people remembered other people’s names, and by the end of the week, everyone really did know everyone else’s name.

We used Wednesday morning to talk about SWC and DC and how the materials might be adapted for the Federal Reserve and economics research. We also had two 45-minute ‘unconference’ sessions – the unconference idea was new to many – for breakout activities. We took ten minutes of the first 30 minutes of the day to have people suggest activities, writing them on Post-It notes (of course!) and then sticking them on a column in the room; one participant organized them for voting on later in the morning.

The breakout activities did not necessarily have to be discussion or working with others; we had Boomwhackers, singing tubes, and several card and board games available, too, and participants could also use the time to practice new skills from the previous two days on their own or with others.

I scheduled a long lunch break, nearly two hours, to end at the Board’s iconic building about a ten-minute walk from the building in which the training was being held. Some played games in the training room during part of the lunch break, while others went out of the building for lunch.

After a tour of the main building and its Research Library, a walk next door to the National Academy of Sciences to visit the Einstein Memorial, and a heavy-duty recruiting job by class participants to get Varda to give up physics for economics (Lachlan, who is working on his PhD in econ, needed no such encouragement!), Lachlan and Varda left to see more of DC, and the rest of the group returned to the classroom to finish out the day preparing for Instructor Training by reading or reviewing materials and going over the certification tasks.

Instructor Training

The Instructor Training was held on Thursday and Friday, and was taught by Karen Word. One feature of the workshop I particularly liked was the opportunity to work with one or two others during the exercises; Karen was excellent about moving people around and getting them talking to one another.

Several people at the Board team-teach an economics computing course at Howard University; the impetus for one or two of them to take Instructor Training was to learn more about teaching, as they had never had any formal training in how to teach. Now they have! Everyone found the instructor training useful.

After the Week o’ Carpentry

One of the bridge day activities was hashing out how to work on editing Data Carpentry materials to be even more meaningful to Federal Reserve staff and the economics community. To keep up the momentum, I scheduled a hack day (‘Technical Collaboration Workshop’ in Fedspeak) for late May, giving Board staff an opportunity to work on these materials, and I set up regular conference calls so the participants could talk about their progress on certification tasks.

I retired on August 1 to devote more time to the Astrophysics Source Code Library (ASCL). By that time, eight participants had already taught one or more sections of workshops held at the Board or at their Banks; three people, all from the Banks, had been certified, and two from the Board were scheduled to become certified in August and another in September.

The Board’s Technical Training is in excellent hands with my successor Rebecca Steffens. She has continued supporting SWC and DC training, and the Board now has five certified instructors running internal workshops tailored to the economics research of the Board.

And I feel like Yay! Carpentries Training Achievement unlocked (even though I’m no longer at the Board!)!


Description and schedule for From Software Carpentry/Data Carpentry Participant to Instructor

This one-day, hands-on collaborative workshop reinforces skills learned in Software Carpentry and/or Data Carpentry and prepares attendees to take Software Carpentry/Data Carpentry Instructor Training. It also provides a forum for discussing adaptations to the Carpentry materials to better fit the FRS environment and lays out a roadmap for those changes and for performing the tasks necessary to complete Software Carpentry/Data Carpentry Instructor certification.

Activities include:

  • Discussion on SWC and DC: Feedback and thoughts
  • Adapting SWC/DC materials for the Federal Reserve
  • Tools and Techniques Break-out Sessions
  • Preparing for Instructor Training

Prerequisites: Completion of either Software Carpentry or Data Carpentry, and enrolment in Software Carpentry/Data Carpentry Instructor Training.

9:00 am - 9:20 am: Feedback and thoughts on SWC and DC
9:20 am - 9:30 am: Unconference/breakout ideas. Ideas included:
  • Gitlab practice/questions
  • R and SQL practice/questions
  • Bash practice/questions
  • OpenRefine practice/questions
  • Pandas demonstration
  • Modifying SWC/DC for FRS use
  • Playing with Boomwhackers/singing tubes/card and board games
9:30 am - 10:30 am: Adapting SWC/DC materials for the Federal Reserve
  • What datasets would work well for us?
  • What lessons should be adapted?
10:30 am - 10:45 am: Break and unconference voting
10:45 am - 12:15 pm: Unconference: Tools and Techniques Break-out sessions
  • 10:45 am - 11:30 am: Tools and Techniques Break-out Session 1
  • 11:30 am - 12:15 pm: Tools and Techniques Break-out Session 2
12:15 pm - 2:00 pm: Lunch
2:00 pm - 3:00 pm: Eccles building/Research Library tour
3:30 pm - 4:30 pm: Prepping for Instructor Training
  • Reading
  • Certification tasks

The Centrality of the Code of Conduct

This is the first in a series of posts about Carpentries’ teaching practices. Subsequent posts will cover the other practices - live coding, sticky notes, helpers, challenges, etherpads - that make Carpentries’ workshops the success that they are.

I gave a talk recently for the Australian National Data Service on ‘teaching the Carpentries way’. Originally I planned to cover six reasons why our workshops are effective, but ended up covering thirteen, with the thirteenth being the Carpentries’ Code of Conduct.

I left the Code till last because it is probably the most important. Unless people observe the Code of Conduct at workshops, all our other positive teaching practices can count for nothing.

Among other things, the Code of Conduct states:

We are committed to creating a friendly and respectful place for learning, teaching and contributing. All participants in our events and communications are expected to show respect and courtesy to others.

Instructors introduce the Code of Conduct at the start of workshops for a reason. As a community that values diversity and inclusivity, a community dedicated to providing a welcoming and supportive environment for all people regardless of background or identity, the Code sits at the very heart of everything we do.

If someone breaches the Code in a workshop, the Instructor is empowered to warn that person and, if need be, to have that person removed from the workshop. We also encourage Instructors to report the behaviour to us. We have developed a manual on how to enforce the Code.

Harassment is unacceptable, as the Code clearly states:

Harassment is any form of behaviour intended to exclude, intimidate, or cause discomfort. Because we are a diverse community, we may have different ways of communicating and of understanding the intent behind actions. Therefore we have chosen to prohibit certain forms of behaviour in our community, regardless of intent. Read more.

The Code helps people feel safe, which assists their learning. It also makes our workshops accessible to people who might otherwise be marginalised.

While not as serious as religious, sexual or racial vilification, or the other behaviours we prohibit, there are still many off-putting things that people at workshops can do. If learners are worried about being mocked, talked over, treated with sarcasm, condescended to, or made to feel small or stupid for any reason, their enjoyment of the workshop will be diminished, if not extinguished altogether. In those situations, rather than take the offending person on, some people simply prefer to give up on the workshop, thus losing their opportunity to pick up vital skills.

If they choose to stay, the offence will still take up valuable room in their minds, leaving much less space for learning.

It is therefore up to the Instructors to set the workshop tone. If someone is endlessly parading their knowledge, or hogging workshop time to show off, then the Instructors must try to rein that person in. Your learners will be grateful, and they will also feel you are ‘walking the talk’, not just paying lip service to an ideal.

An attendee at a workshop I taught last year wrote on a feedback sticky: “Nice that there are talking rules”. The sticky included a smiley face. A meaningful Code of Conduct makes the workshop better for everyone.

However, it is not only in our workshops that the Code of Conduct applies. We want all interactions within our community to be underpinned by the Code, whether it be contributions to email lists such as Discuss (info on joining all our lists appears on this page), responses to tweets or Facebook postings, discussions about issues raised on GitHub repositories, or contributions to our Slack channel.

As we move forward to the merged Carpentries, it is timely to remind people why we value our Code of Conduct. The Code is central to our efforts to build a welcoming, diverse, inclusive global community.

New Year Message from the Carpentries' Executive Director

One of my favorite sticky note moments involved two sticky notes, red and green, folded into origami cranes, that I received after a workshop. Maybe it was just idle hands working away during an explanatory section, but it seemed to me to be a quiet ‘thank you’ for the workshop, one that recognized that there were both good things and bad things in the process of learning, but that they could balance out and create something new and beautiful.

In The Carpentries we teach people computational tools and approaches, and work to build confidence, so that people can answer research questions, solve problems, and create new solutions that can impact science, scholarship and society for the better.

When we give others a chance to fulfill their greatest potential, we all win. - Michelle Obama

Data and computers aren’t important on their own. It’s the people who use the data and write the code to answer questions who hold that power. As there are ever more opportunities to use computational skills and approaches, and fewer and fewer people with the right skills, we skew who is able to ask important questions and impact society. That’s why it is so crucial to democratize data skills, scaling who has access to training and creating a community of practice that values not just the tools, but the people who use them and teach them.

Software Carpentry and Data Carpentry are grounded in this idea of respect and inclusivity and value for the people who teach and learn. While workshops were the original seed, the true strength of the Carpentries comes from its community. As Software and Data Carpentry merge to form The Carpentries, we have an opportunity to continue to grow this community, to train others, and each other, and to reach new communities - whether that’s geographies or domains or even a research group down the hall. We not only provide effective training that emphasizes open and reproducible research practices, but we are exemplars of how to work collaboratively and inclusively. We don’t hit the mark every day, but it’s the ethos of who we are and the research we want to see in the world.

As The Carpentries, we’ll continue to work with you, to support you in your teaching and in continuing to learn, to connect you with each other, and respond to what communities of instructors and learners need. Our commitment is to effective, inclusive and accessible training in computational skills, and openness and reproducibility in our own work. We are excited about the continued journey ahead and grateful to everyone who has been an instructor, a learner, a helper, a mentor, a lesson maintainer, or a champion.

I truly believe in the power of our community, and ‘the Carpentries way of teaching’ to change how we work with data and expand the number of people who get to do that work. As Executive Director, I’m grateful for this opportunity to help lead The Carpentries in the next steps of our journey and most importantly to empower the organization and the Carpentries community to reach their full potential.

Tracy Teal, Executive Director


This letter originally appeared in our newsletter, Carpentry Clippings, 9 January, 2018.

A Look Back and A Look Ahead from the Data Carpentry Steering Committee

On May 8-9, 2014 Data Carpentry hosted its first workshop at the National Evolutionary Synthesis Center (NESCent). That workshop came out of an identified unmet need for the skills and perspectives to work effectively and reproducibly with data, as data became more pervasive in many areas of research. The biggest difference from other trainings, including Software Carpentry, was not so much what we taught (specific tools), but how we taught it – with a focus on data management and analysis rather than writing software. In all our workshops, we use real publicly available data and the workshop is a narrative, going from start (data and project organization) to finish (analysis and visualization) through the course of the workshop, providing an onramp to using data skills in friendly, accessible workshops. The goal is to teach skills, and even more importantly, to show people what’s possible, and that they can do it, so people have the confidence and enthusiasm to go on to learn more and use these skills in their own research.

That original goal turned into our vision: Building Community Teaching Universal Data Literacy

As Data Carpentry finishes 2017 and merges its governance with Software Carpentry to become The Carpentries, the reasons people came to that first workshop, and the reasons we teach it, still remain. Looking back at some of those reasons now, we still see them echoed by researchers around the world:

  • I’m tired of feeling out of my depth on computation and want to increase my confidence.
  • I usually manage data in spreadsheets and it’s frustrating and I want to do it better.
  • I want to teach a reproducible research class.
  • I want to use public data.
  • I work with faculty at undergraduate institutions and want to teach data practices, but I need to learn it myself first.
  • I’m interested in going into industry and companies are asking for data analysis experience.
  • I’m trying to reboot my lab’s workflow to manage data and analysis in a more sustainable way.
  • I’m re-entering data over and over again by hand and know there’s a better way.

Now at the end of 2017, and with the support of funding from the Gordon and Betty Moore Foundation, there have been 193 workshops on 6 continents, new curriculum in genomics, developing curriculum in geospatial data and social sciences and reproducible research, numerous talks and presentations, development of an assessment program and 561 badged Data Carpentry instructors. We have partnered with Software Carpentry to develop an instructor training program with now 111 instructor training events and 44 instructor trainers, worked with the community to update our Code of Conduct and mentorship, updated operations and workshop coordination and started a Membership program. We now have 54 Data and Software Carpentry Member organizations, helping to build local capacity for data and computational training.

Every year the Data Carpentry Steering Committee and Executive Director set a strategic plan and goals for the year ahead. For 2017, we made motions to commit to conversations about merging with Software Carpentry, finalize lessons in ecology, genomics and geospatial data, increase the number of Member organizations and workshops to build sustainability and reach more communities, plan a CarpentryCon in 2018 and build infrastructure for lesson development. We met most of those objectives, with CarpentryCon planned for Dublin, lesson releases for ecology and genomics and planning for geospatial, an increase in memberships and workshops and the initial development of lesson infrastructure (more on that coming soon!). Importantly, for workshops, we continue to meet those goals of building skills and increasing confidence in using them. In post-workshop responses:

  • 41.4% of people said they gained some practical knowledge and 58.0% said they gained a great deal of practical knowledge.
  • 67.7% of people agreed that they can immediately apply what they learned at the workshop.
  • 84.7% of people agreed that they would recommend this workshop to a friend or colleague.
  • 49.3% of people reported their data management and analysis skills were somewhat higher post-workshop, while 43% reported their skills were higher or much higher.

What has made Data Carpentry possible, and allowed it to keep growing, has been the amazing support, work and enthusiasm of the community. Software Carpentry welcomed Data Carpentry into the community to expand the kinds of offerings available; instructors stepped up and started to teach Data Carpentry workshops; there have been hundreds of contributions to lessons, from the original Ecology ones to new curriculum; and so many have advocated for hosting workshops or supporting Data Carpentry in many ways.

post-workshop survey word cloud

As you can see in the word cloud of open-ended workshop survey responses (above), what stands out is “instructors”. They are the strength of this organization. In that vision of building community teaching universal data literacy, they are building community with each other, with learners in workshops and in the way that they work and interact with others every day. They are working to make data skills more accessible to all and empowering others to do work that has the potential to change science, scholarship and society. As a Steering Committee and Executive Director, we are truly grateful that so many have wanted to help create and be a part of this journey.

We’re excited for this journey to continue with a merger of Software and Data Carpentry into The Carpentries. The Carpentries will provide more effective operations, support more communities and curriculum, provide the quality training and capacity building that matches a diversity of needs, and bring more people to the growing world of data and computation.

In the transition to The Carpentries, there will continue to be oversight of the Data Carpentry curriculum by a Data Carpentry Advisory Committee, which is still being formed, and members of the current Steering Committee (Karen Cranston and Ethan White) will join the governance of The Carpentries in the Executive Council. Karthik Ram, Aleksandra Pawlik, and Hilmar Lapp will step down from Data Carpentry governance. Tracy Teal, the Data Carpentry Executive Director, will be the Executive Director of The Carpentries. Through this transition in roles, we’re continuing the ethos of that first workshop, developing a shared approach to training people in the computational skills to do their work and building an empowered community that continues learning.

We can’t thank you enough for the support of Data Carpentry over these past few years. We have been fortunate every day to work with this community and look forward to going even further together.

Carpentries Transition From Fiscally Sponsored Project to NumFOCUS Community Alliance Member


Software Carpentry and Data Carpentry have combined their separate projects into a new project, now known as The Carpentries.

As part of this transition, Software Carpentry and Data Carpentry are moving from Fiscally Sponsored Projects with NumFOCUS to The Carpentries with Community Initiatives, whose fiscal sponsorship administration services are better aligned with our emerging needs. The Carpentries looks forward to new opportunities with NumFOCUS and will continue to participate in the NumFOCUS Community as a new member of the NumFOCUS Community Alliance.

As a Community Alliance member, we will be one of the organizations whose mission intersects with that of NumFOCUS and reflects support for open source scientific computing. NumFOCUS cross-promotes activities and events held by members of their Community Alliance in a reciprocal, supportive relationship. In particular, both organizations share a commitment to increasing diversity and inclusion in the data community.

Software Carpentry joined NumFOCUS as a fiscally sponsored project in 2014, and Data Carpentry joined NumFOCUS as a fiscally sponsored project in 2015. Over the ensuing years both Carpentries worked closely together and in early 2017 began discussions about a merger. The merger was approved last summer, and this January marks the milestone of a fully merged project, The Carpentries, with a newly-elected Executive Council.

We are very grateful for the support of NumFOCUS through the initiation, development and growth of Software and Data Carpentry, and for their continued support through this transition to The Carpentries with Community Initiatives. We look forward to continuing to work closely with NumFOCUS as a member of their Community Alliance, to promote the growth and development of the open source scientific computing community.

State of the State: Carpentry Maintainers


All of the great work that we do as The Carpentries is dependent on the hard work and creativity of our community of volunteers. Each of you plays a vital role in helping us fulfill our mission of spreading data skills and computational literacy to researchers and other professionals worldwide. Within our Carpentry community, there are a number of subcommunities of like-minded folks carrying out particular aspects of the Carpentry mission. This blog post is the first in a series focusing on one of our sub-communities - the Maintainers.

All Carpentry lessons are kept up-to-date and functional by a small group of Maintainers. Maintainers review pull requests and issues to their lesson repositories and also engage with the community about the overall goals and direction of their lesson. In the last months of 2017, I engaged the Maintainer group in a set of individual conversations to understand the issues facing the Maintainer community as we grow and to develop an action plan to help this group best move forward their important work.

Through this process, I had the opportunity to talk individually with nearly half (46%) of current Maintainers. The emergent themes from these conversations are detailed in the Carpentry Maintainer Interviews - 2017 Report and include:

  • Aspects of the Maintainer experience that people enjoy are the opportunity to shape the lesson, to interact with the community, the chance to learn new things, and the ability to have a larger impact.
  • Major issues that Maintainers experience are a sense of being overwhelmed, wanting more guidance and help, clarity about their roles and authority, and overall negative feelings about their level of involvement in their lessons.
  • Many Maintainers expressed interest in being more active in the community and getting to know their co-Maintainers better.

I’ve really appreciated the opportunity to talk individually with members of the Maintainer community and understand the issues they face. Based on these conversations, there are five action items in progress to help resolve the issues identified.

1) Changing to an application model for recruiting new Maintainers: To reduce feelings of guilt and pressure to be a Maintainer, we are changing to an application-based model for recruiting new Maintainers.
2) Recruiting new Maintainers: In November, The Carpentries put out a call for new Maintainers to join the Maintainer team. There were 23 applicants, of whom 22 were invited to join Maintainer onboarding. Fifteen of these new Maintainers finished onboarding last week.
3) Providing training for new Maintainers: A pilot curriculum for onboarding new Maintainers is in use with the new Maintainers. Contributions are welcome!
4) Facilitating interactions among Maintainers: Monthly meetings for the Maintainers community have been scheduled and are advertised on the community calendar and Maintainer Etherpad.
5) Rethinking the Instructor checkout process: Feedback on proposed changes to the Instructor checkout process has been requested from the Maintainers and Trainers groups and any changes will be implemented in the next months.

I truly appreciate the candor of all those who have shared their experiences with being a Maintainer. Understanding the difficulties we face as a community is a necessary first step to resolving these issues. Please join me in enthusiastically thanking the Maintainers for all of the work that they do in keeping our lessons and workshops running smoothly!

Expanding Library Carpentry: Hiring a Library Carpentry Coordinator


California Digital Library is hiring for a Library Carpentry Project Coordinator!

Job posting and application

The deadline for the Library Carpentry Project Coordinator position has just been extended. New deadline February 21, 2018.

California Digital Library is looking to hire a Library Carpentry Coordinator who will work with The Carpentries to further develop Library Carpentry and continue to build a community and curriculum. A successful Library Carpentry Coordinator is passionate about bringing data skills and perspectives to the library community and supporting libraries in building local capacity for training. The Library Carpentry Coordinator will use product management skills, project management skills, software/data skills, and knowledge of the Data Carpentry & Software Carpentry community to promote and expand the reach of Library Carpentry across the US. While this is a US-based position, the work done will be integrated with and support the efforts of the global Library Carpentry community. Applications are open, and the deadline is February 21, 2018.

This position is funded for two years by an IMLS grant to the California Digital Library (CDL). This position will work closely with The Carpentries as well as the CDL and its digital curation team, University of California Curation Center (UC3).

Job responsibilities:

  • Evaluate the current offerings of Library Carpentry in the US.
  • Work across stakeholder groups to explore new programs and modules.
  • Work with Carpentry community to recommend new initiatives that are applicable to Library Carpentry’s long-range, strategic plans.
  • Identifies, organizes, and participates in program discussions with key advisory groups.
  • Identifies additional opportunities for Library Carpentry module development and works to draft, test, and implement quickly (with stakeholder interaction and feedback).
  • Develop and implement strategies for promotion of Library Carpentry.
  • Create marketing materials, update website content, contact institutions, and present at workshops and/or conferences.
  • Develops and participates in marketing and professional outreach activities and informational campaigns to raise awareness of Library Carpentry including communicating developments and updates to the community via social media. This includes maintaining blog, Twitter and Facebook accounts, GitHub Issues, and listservs.
  • Develops project plans including goals, deliverables, resources, budget and timelines for enhancements.
  • Acting as liaison across external agencies to ensure effective production, delivery and operation.
  • Working with Carpentry leadership, assist in strategic planning for Library Carpentry, prioritizing and guiding future development of the program. Pursue outside collaborations and funding opportunities for future development including developing an engaged community of librarians to contribute to the program. Foster an engaged open community for future maintenance and enhancement.
  • Provides periodic progress reports outlining key activities and progress toward achieving overall goals. Develops and reports on metrics/key performance indicators and provides corresponding analysis.

Job qualifications

  • Bachelor’s degree in related area and/or a minimum of two to three years of data management, data/software skills training, and/or digital curation.
  • Extensive knowledge of the Software Carpentry and/or Data Carpentry pedagogy and community norms.
  • Excellent project management skills and experience coordinating and promoting services. Proven ability to research, collect and analyze information to use in determining product options or alternatives.
  • Demonstrated ability to engage with people in new settings as well as excellent interpersonal and communication skills.
  • Ability to lead, build consensus and promote the exchange of information among project team, internal and external constituencies.
  • Strong oral communication skills to effectively convey and explain information to different audiences.
  • Strong written communications skills to draft clear, concise documentation, reports and specifications.
  • Demonstrated understanding of the research data processes including data collection, description, sharing, preservation, and management.
  • Knowledge of and experience with data driven research and the emerging importance of data management and sharing.

Preferred

  • Familiarity with basic user assessment techniques and demonstrated experience working with library communities.
  • Good understanding of software development techniques and the emerging importance of web scripting and software development in the research process.
  • Experience with Microsoft Office required. Experience with GitHub, Markdown, HTML, and coding languages (e.g., Ruby, Python) is desirable.
  • Entrepreneurial attitude to developing services; self-motivated, with the ability to set and attain goals effectively and the flexibility to adapt to change.
  • Software Carpentry, Library Carpentry, or Data Carpentry Instructor Certifications
  • Advanced degree in library-related field.

Compensation will be negotiated commensurate with experience.

To Apply

To apply, please submit your resume/CV and a cover letter through the California Digital Library job posting.


Unveiling the Carpentries Logo


Now that Software and Data Carpentry have merged, we wanted a new logo to celebrate our coming together as the Carpentries, and to give that project its own distinct identity.

So ... here is our new logo!

The new logo retains a ‘Carpentry’ feel - at the basic level, it represents a wrench around a hexagonal bolt. Yet it also conveys a sense of exhilaration and celebration - that magic moment when you ‘get’ something and your arms shoot up in celebration. There are many such ‘aha!’ moments in Carpentries’ workshops, so it is fitting that our logo represent not just the hard work of learning (the wrench) but the satisfaction of achievement and mastery that we gain (the ‘Yay!’).

The same, but different

While we have a new logo, and one that we like very much, as far as our community goes, much of what we do will seem unchanged.

As The Carpentries, we will continue to teach foundational computational and data skills to researchers. We will continue to observe and evolve our Code of Conduct. We will continue to grow our memberships, and we will continue to mint new instructors through our Instructor Training program.

The individual ‘Carpentries’ will remain as distinct lesson organizations, and we plan to communicate more as the year goes on about how these projects are evolving. The Software and Data Carpentry logos will remain the same, with The Carpentries an umbrella under which they come together.

Some things are different. Tracy Teal is now our Executive Director, our two staffs have merged with some reshuffling of roles, and we are working as The Carpentries with a new fiscal sponsor, Community Initiatives. Our governance has merged - from having two separate Steering Committees, we now have a brand new Executive Council.

These changes should only enhance what we do by streamlining communications and making our working practices more efficient. We will still support the growth and spread of our community - that will never change.

CarpentryCon 2018 in Dublin will be a celebration of just how far we have come as a community. We hope to see you there.

Keep an eye out for our new website soon!

Scaling Collaborative Curriculum Development for Data Skills Training


We are excited to announce that we have received a grant from the Alfred P. Sloan Foundation to train researchers in essential data skills and build a general framework for collaborative lesson development to scale data training. This grant will allow us to create general infrastructure, guidelines and pathways for community engagement to establish open source lesson development as a practice and enable scalable, collaborative data training. Curriculum developed through this grant will include economics, image analysis and chemistry. This work will be a proving ground for the establishment of infrastructure and processes for collaborative and open lesson development in other domains and topics.

There is increasing awareness of the need for data skills training across a diversity of domains. Necessary core data skills include: data organization and cleaning, exploratory analysis (generating simple summaries and graphs) and data management (sharing, storage). These skills require competency with common file formats, data types, command line tools and the programming languages used by researchers within a particular domain. As different universities and organizations begin to see the need to teach these skills, there is an opportunity to work together to build curriculum, rather than each organization developing their own content in isolation. There is great power in the community perspective, both in what is essential to teach and in the development of materials, and also in the continued re-teaching and re-use of the same materials. This works to improve the content over time and helps keep it relevant and up-to-date. To be maximally effective, these training materials should be accessible, discoverable, and follow best practices derived from educational research.

The Carpentries are at the forefront of this kind of curriculum development, dissemination and teaching strategy. Our curricula are developed collaboratively, are freely available (CC-BY licensed) and are delivered by hundreds of trained volunteer instructors around the world each year. As we have built up our reputation for offering quality trainings, many communities have approached us to help develop and disseminate new content in digital humanities, astronomy, social sciences, library sciences, imaging, economics, chemistry, statistics, high performance computing, meteorology and neuroimaging.

Because of this broad interest, there is a need to establish clearer processes and infrastructure to scale this approach to lesson development. This project will build that infrastructure and develop processes that both engage the community and make contributions more effective and straightforward.

The Carpentries have hired Dr. François Michonneau to lead these curriculum development efforts. We’re excited to welcome François as our Curriculum Development Lead. He brings technical expertise, experience both in teaching and curriculum development, and an inclusive approach to lesson contributions and open source software development to the role. François is a long time Data and Software Carpentry community member. In 2014, as he was planning to teach a semester-long R programming course for the graduate students of the biology department at the University of Florida, he came across Software Carpentry. Intrigued by the pedagogical approach of these workshops, he wanted to experience it firsthand, and attended the inaugural Data Carpentry workshop there. Soon after, he became a certified Instructor, and has since taught a dozen workshops. He is also one of the developers and maintainers for the Data Carpentry R ecology lesson, and has helped organize the development of the Reproducible Science Curriculum lessons. This summer he certified as a Carpentries Instructor Trainer.

François received his PhD at the University of Florida studying marine biodiversity, where he documented the diversity of sea cucumbers, and in the process described a new species he named after the dog of the museum collection manager assistant (both are very fluffy). As a postdoctoral researcher at the Whitney Marine Laboratory, he synthesized marine biodiversity knowledge available from public databases and used data science approaches to identify knowledge gaps, and levels of digitization for the US marine invertebrate fauna.

François is also the maintainer of several R packages centered around the manipulation of phylogenetic data and an active member of the rOpenSci community. He believes that open and reproducible science can transform the scientific process by generating robust results that can more easily be expanded on. He is excited to lead the growth of the curriculum taught by the Carpentries, so more people and more disciplines can learn the skills needed to conduct open and reproducible research. François is on Twitter as @fmic_, on GitHub as fmichonneau, and his personal website is https://francoismichonneau.net.

We are excited about this project and the opportunity to scale open, collaborative curriculum development in The Carpentries and provide frameworks and processes for training in the data science community as a whole. Please join us in welcoming François, as he works with the lesson infrastructure community on ideas for updates and in supporting the lesson development and maintainers community.

Valerie Aurora to Keynote at CarpentryCon 2018


Valerie Aurora

The Carpentries are excited to announce that Valerie Aurora will be one of four keynote speakers at this May’s CarpentryCon in Dublin.

Valerie is a software engineer turned diversity and inclusion consultant.

We want CarpentryCon 2018 to be a truly global, diverse and inclusive event, which is why we are so happy that Valerie has accepted our offer to speak there.

Valerie founded Frame Shift Consulting, which helps technology organizations build in-house expertise and leadership in diversity and inclusion.

Members of our community may know her as the creator and facilitator of Ally Skills Workshops, which teach simple, everyday ways for people who have more power and influence to support people with less. Valerie has taught these skills to thousands of people.

In addition to keynoting, Valerie has offered to teach an Ally Skills Workshop at CarpentryCon. I am sure many of our community will scramble to attend that.

She was a co-founder of the Ada Initiative, which, between 2011 and 2015, supported women in open technology and culture by producing codes of conduct and anti-harassment policies, advocating for gender diversity and teaching ally skills.

Valerie also helped establish Double Union, a non-profit that supports women and non-binary people in technology and the arts.

She previously worked for more than 10 years as a Linux kernel and file systems developer at Red Hat, IBM, Intel, and other technology companies.

Valerie is on Twitter.

Register here for CarpentryCon 2018.

Mentoring Groups Showcase their Accomplishments


We just finished our second round of mentoring groups and had an amazing showcase of their work and ideas. In this round, we were more specific and focused on multiple topic areas. There were groups on community building, lesson maintenance, and preparing for instructor checkout.

The feedback and outcomes were great! Participants were able to focus on specific goals, including teaching their first workshop and developing new lesson contribution material. Being a part of a group that addressed something important to them (e.g., developing communities in Japan and Singapore) made the mentoring groups powerful and enjoyable.

Read about what community members have accomplished in these mentoring groups, find out how to get involved, or give feedback on how mentoring would be useful to you!

The second round of the Carpentries mentoring groups began October 25, 2017. Goals of the revised mentoring groups were to offer curriculum-specific mentoring, and encourage groups to focus their efforts on lesson maintenance, teaching, organizing workshops, or building local communities. If you missed the wrap-up of the first round of mentoring, check out this blog post.

Over a period of four months, 20 mentors and 39 mentees (a total of 14 groups) representing eight time zones met either in-person or virtually to accomplish specific goals. Kari Jordan hosted a training session on November 9, 2017 to help mentors prepare for their first meeting, and to discuss goal setting. On November 28, 2017, mentors participated in a “power check-in” to discuss issues and any concerns they were having with their groups. These were mostly scheduling-related as we were nearing the holiday season.

Results from the mid-program survey showed that several groups were working on projects to build local communities, and several group members were preparing to teach specific Carpentries lessons. Participants identified several resources that would improve their experience, such as a dedicated Slack channel, and more time to work with their groups. A mentoring Slack channel was created, and the program was extended from January 10, 2018 to February 6, 2018.

The culmination of this mentoring period was the mentoring groups virtual showcase, which took place February 6, 2018. Two showcases, held to accommodate multiple time zones, hosted a total of 25 attendees. During this time, mentoring group representatives presented PowerPoint slides showcasing either what they learned, or something cool they developed during their mentoring period. A lively discussion took place on the Etherpad, and several resources were added to the mentoring-groups repo on GitHub. Here are a few highlights from the showcase:

  • Kayleigh taught her first workshop as a qualified instructor at the first ever Library Carpentry workshop in Ethiopia.
  • Katrin completed the check-out process and onboarded as an r-novice-inflammation Maintainer.
  • One of the African mentoring groups emphasized the value of community in helping to get workshops organized.
  • A local data science community was started at the University of Konstanz, Germany.
  • Robin got to live demo an R lesson.
  • Blake’s workshop ran in January and got snowed in!
  • Chris developed a mentoring program plan.
  • Simon found out who to get in touch with, and will be an instructor at a Stanford workshop in March.
  • Toby contributed improvements to mentoring material on GitHub.
  • One group drafted step-by-step instructions for beginners to contribute to lessons using the terminal or a web browser.
  • Malvika contributed to the CarpentryCon taskforce and shared ideas with Kari for the next mentoring round.
  • One group used the evolving community ‘cookbook’ to plan activities in Japan and Singapore.

Did you miss the showcase? Check out the recording from the second showcase!

Why should you participate in mentoring?

Both mentors and mentees received certificates for participating in their groups, and several group members plan to continue working together beyond this round of mentoring. Mentoring group participants were asked to tell the community why they should participate in mentoring. Here is what they said:

  • It gives you a direct and personal channel for questions and support.
  • You get to know other Carpentries colleagues from across the world at differing levels of experience.
  • You accomplish goals you probably wouldn’t have accomplished otherwise.
  • You learn new things and gain new perspectives.
  • You meet more community members.
  • It speeds up instructor checkout, and brings forward first teaching experience at a workshop.
  • You become a more confident contributor and practice PRs on lessons before submitting them to the main repo.
  • It’s very rewarding to help people with SWC material in an in-depth, one-on-one setting.
  • You never know what connections you will make!
  • You gain community connections and support to grow our collective abilities.
  • You get advice on organising workshops.
  • You get help when starting a community from scratch.
  • It’s a great opportunity to learn more about the Carpentries programs and to make connections with current instructors.
  • You receive positive feedback for running a workshop.

Mentoring group meetup in Germany. Photo credit: G Zeller (EMBL Bio-IT)

Where do we go from here?

The post-mentoring survey results showed that the major concern during this period of mentoring was finding a schedule that fit everyone. Additionally, several participants suggested that a longer duration would be useful. Lastly, there were recommendations for open selection of mentoring groups.

As a result of the feedback from this round of mentoring, and discussions among the mentoring subcommittee, we are in the process of developing the instructor discussion sessions such that they include ongoing mentoring for new instructors and experienced community members. Look for the next round of mentoring to begin this April!

In the meantime, get involved with mentoring by requesting to join the mentoring Slack channel and/or attending the next Mentoring Subcommittee meeting.

Are these things that would help you, or keep you engaged with the Carpentries? Tweet us your thoughts (@datacarpentry, @swcarpentry, @thecarpentries, @drkariljordan) using the hashtag #carpentriesmentoring.

State of the State: Instructor Checkout


This blog post is the second in a series examining the roles and contributions of the different parts of the Carpentry community. In case you missed it - read the first post in this series, about Maintainers.

Carpentry Instructors are the core of our community. Without Instructors, there would be no workshops. Because of the vital role that Instructors play in advancing the Carpentry mission, we as a community take preparing Instructors very seriously. Before becoming certified Instructors, trainees must show familiarity with our curriculum, demonstrate their teaching skills (with a focus on the Carpentry pedagogical model), and interact with the broader Carpentry community. Software Carpentry Instructors also need to demonstrate familiarity with Git and GitHub.

Since 2015, these goals have been served by a three-part checkout mechanism: Submitting a lesson contribution, Participating in an instructor discussion session, and Presenting a short teaching demonstration.

These steps are estimated to take a total of 8-10 hours and are overseen by the Maintainers group, the Mentoring Subcommittee, and the Trainers group, respectively. These groups frequently discuss how to ensure that our checkout process is continuing to meet the needs of new Instructors as our community grows and changes.

Recently, staff facilitated a set of discussions with the Mentoring Subcommittee, Maintainers, and Trainers, to understand whether there were reasons to remove one or more of the steps of the checkout process, and more broadly, to understand how members of these groups feel these steps are meeting Instructors’ needs. Getting input from each of these groups proved to be vital, as different parts of the community had different perspectives about these steps and how they affect Instructor preparation. Although the decision at this time was to maintain the current checkout process, there were many ideas raised about how we can change this process in the future to better align with the needs of new Instructors.

The three topics raised for discussion were:

  • Removing the requirement for trainees to submit a lesson contribution. This was brought to the Maintainers and Trainers groups for discussion. Many voiced concerns that, without this requirement, new Instructors would not be prepared to contribute to lessons in the future. Other options to require trainees to use GitHub without increasing Maintainer workload were discussed. The decision was to make no changes to this requirement at this time, but to clearly communicate to trainees that rather than creating new issues or putting in unsolicited PRs, they can help by contributing to existing issues, reviewing existing PRs, and putting in PRs for requested issues. The Trainers group will work to better communicate this with new trainees. On the Maintainers side, there is work ongoing to update issue labels to help guide contributions.
  • Removing the requirement for trainees to participate in an instructor discussion before becoming Certified. This was brought to the Mentoring Subcommittee and the Trainers group for discussion. In both groups, people expressed concern that these discussions were necessary to prepare new Instructors to teach. The decision was not to change this requirement at this time, but to continue exploring other opportunities to provide mentorship for new Instructors.
  • Removing the requirement that trainees must complete their teaching demo with a Trainer who did not teach their instructor training. This policy was intended to avoid conflicts of interest by requiring that new Instructors were approved by Trainers outside of their institutions; however, it inadvertently disadvantaged new Instructors in geographic areas with fewer Trainers. The Trainers group passed this change with a vote of 22:1 with 1 abstaining. Trainers are still encouraged to identify any potential conflicts of interest.

To summarize, although all three steps of the checkout process will remain the same for the time being (with the minor change that trainees will now be able to schedule their teaching demonstration with any Trainer), there have been many good ideas generated during this discussion process that will help us as we plan future revisions to continue to meet the needs of our community. If you’re interested in learning more about these conversations, read:

Preparing new Instructors is an important job that is shared across our community. There are many ways you can be involved!
