Author Archives: Research Blog

Google’s Course Builder 1.9 improves instructor experience and takes Skill Maps to the next level



(Cross-posted on the Google for Education Blog)

When we last updated Course Builder in April, we said that its skill mapping capabilities were just the beginning. Today’s 1.9 release greatly expands the applicability of these skill maps for you and your students. We’ve also significantly revamped the instructor’s user interface, making it easier for you to get the job done while staying out of your way as you create your online courses.

First, a quick update on project hosting. Course Builder has joined many other Google open source projects on GitHub (download it here). Later this year, we’ll consolidate all of the Course Builder documentation, but for now, get started at Google Open Online Education.

Now, about those features:
  • Measuring competence with skill maps
    In addition to defining skills and prerequisites for each lesson, you can now apply skills to each question in your courses’ assessments. By completing the assessments and activities, learners will be able to measure their level of competence for each skill. For instance, here’s what a student taking Power Searching with Google might see:
This information can help guide them on which sections of the course to revisit. Or, if a pre-test is given, students can focus on the lessons addressing their skill gaps.

To determine how successful the content is at teaching the desired skills across all students, an instructor can review students’ competencies on a new page in the analytics section of the dashboard.

  • Improving usability when creating a course
    Course Builder has a rich set of capabilities, giving you control over every aspect of your course -- but that doesn’t mean it has to be hard to use. Our goal is to help you spend less time setting up your course and more time educating your students. We’ve completely reorganized the dashboard, reducing the number of tabs and making the settings you need clearer and easier to find.
We also added in-place previewing, so you can quickly edit your content and immediately see how it will look without needing to reload any pages.
For a full list of the other features added in this release (including the ability for students to delete their data upon unenrollment and removal of the old Files API), see the release notes. As always, please let us know how you use these new features and what you’d like to see in Course Builder next to help make your online course even better.

In the meantime, take a look at a couple of recent online courses that we’re pretty excited about: Sesame Street’s Make Believe with Math and our very own Computational Thinking for Educators.

The neural networks behind Google Voice transcription



Over the past several years, deep learning has shown remarkable success on some of the world’s most difficult computer science challenges, from image classification and captioning to translation to model visualization techniques. Recently we announced improvements to Google Voice transcription using Long Short-term Memory Recurrent Neural Networks (LSTM RNNs)—yet another place neural networks are improving useful services. We thought we’d give a little more detail on how we did this.

From its launch in 2009, Google Voice transcription had used Gaussian Mixture Model (GMM) acoustic models, the state of the art in speech recognition for 30+ years. Sophisticated techniques like adapting the models to the speaker's voice augmented this relatively simple modeling method.

Then around 2012, Deep Neural Networks (DNNs) revolutionized the field of speech recognition. These multi-layer networks distinguish sounds better than GMMs by using “discriminative training,” differentiating phonetic units instead of modeling each one independently.

But things really improved rapidly with Recurrent Neural Networks (RNNs), and especially LSTM RNNs, first launched in Android’s speech recognizer in May 2012. Compared to DNNs, LSTM RNNs have additional recurrent connections and memory cells that allow them to “remember” the data they’ve seen so far—much as you interpret the words you hear based on previous words in a sentence.

By then, Google’s old voicemail system, still using GMMs, was far behind the new state of the art. So we decided to rebuild it from scratch, taking advantage of the successes demonstrated by LSTM RNNs. But there were some challenges.
An LSTM memory cell, showing the gating mechanisms that allow it to store and communicate information. Image credit: Alex Graves
There’s more to speech recognition than recognizing individual sounds in the audio: sequences of sounds need to match existing words, and sequences of words should make sense in the language. This is called “language modeling.” Language models are typically trained over very large corpora of text, often orders of magnitude larger than the acoustic data. It’s easy to find lots of text, but not so easy to find sources that match naturally spoken sentences. Shakespeare’s plays in 17th-century English won’t help on voicemails.

We decided to retrain both the acoustic and language models, and to do so using existing voicemails. We already had a small set of voicemails users had donated for research purposes and that we could transcribe for training and testing, but we needed much more data to retrain the language models. So we asked our users to donate their voicemails in bulk, with the assurance that the messages wouldn’t be looked at or listened to by anyone; they would only be used by computers running machine learning algorithms. But how does one train models from data that’s never been human-validated or hand-transcribed?

We couldn’t just use our old transcriptions, because they were already tainted with recognition errors—garbage in, garbage out. Instead, we developed a delicate iterative pipeline to retrain the models. Using improved acoustic models, we could recognize existing voicemails offline to get newer, better transcriptions the language models could be retrained on, and with better language models we could recognize the same data again, and repeat the process. Step by step, the recognition error rate dropped, finally settling at roughly half what it was with the original system! That was an excellent surprise.
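To make the loop concrete, here is a minimal sketch of this kind of iterative self-training pipeline. The function names (recognize, train_language_model, train_acoustic_model, word_error_rate) are hypothetical placeholders rather than Google-internal APIs, and the stopping rule is illustrative only.

```python
# A minimal sketch of the iterative retraining loop described above.
# All function names are hypothetical placeholders, not Google-internal APIs.

def iterative_retraining(voicemail_audio, acoustic_model, language_model,
                         dev_audio, dev_reference, max_rounds=5, min_gain=0.001):
    """Alternate between transcribing unlabeled audio and retraining the models."""
    best_wer = word_error_rate(recognize(dev_audio, acoustic_model, language_model),
                               dev_reference)
    for _ in range(max_rounds):
        # 1. Recognize the unlabeled voicemails offline with the current models.
        transcripts = recognize(voicemail_audio, acoustic_model, language_model)
        # 2. Retrain the language model on the newer, better transcriptions.
        language_model = train_language_model(transcripts)
        # 3. Refresh the acoustic model on the re-decoded data as well.
        acoustic_model = train_acoustic_model(voicemail_audio, transcripts)
        # 4. Check progress on a small human-transcribed development set.
        wer = word_error_rate(recognize(dev_audio, acoustic_model, language_model),
                              dev_reference)
        if best_wer - wer < min_gain:   # stop once the error rate plateaus
            break
        best_wer = wer
    return acoustic_model, language_model, best_wer
```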

There were other (not so positive) surprises too. For example, sometimes the recognizer would skip entire audio segments; it felt as if it were falling asleep and waking up a few seconds later. It turned out that the acoustic model would occasionally get into a “bad state” where it would think the user was no longer speaking and that what it heard was just noise, so it stopped outputting words. When we retrained on that same data, the model would conclude that those spoken sounds should indeed be ignored, reinforcing the bad behavior even further. It took careful tuning to get the recognizer out of that state of mind.

It was also tough to get punctuation right. The old system relied on hand-crafted rules or “grammars,” which, by design, can’t easily take textual context into account. For example, in an early test our algorithms transcribed the audio “I got the message you left me” as “I got the message. You left me.” To try and tackle this, we again tapped into neural networks, teaching an LSTM to insert punctuation at the right spots. It’s still not perfect, but we’re continually working on ways to improve our accuracy.
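As an illustration of the general idea (not the production architecture), punctuation insertion can be framed as sequence tagging: for each word, predict which punctuation mark, if any, should follow it. A minimal Keras sketch, with arbitrary vocabulary size and layer widths, might look like this:

```python
# Illustrative only: a small sequence-tagging LSTM that predicts, for each word,
# which punctuation mark (if any) should follow it. The post does not describe
# the actual architecture; the sizes below are arbitrary placeholders.
import tensorflow as tf

VOCAB_SIZE = 20000        # assumed word-vocabulary size
PUNCT_CLASSES = 4         # e.g. none, period, comma, question mark

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),
    tf.keras.layers.LSTM(256, return_sequences=True),   # one prediction per input word
    tf.keras.layers.Dense(PUNCT_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# Training would pair word-ID sequences with per-word punctuation labels:
# model.fit(word_ids, punctuation_labels, epochs=...)
```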

In speech recognition as in many other complex services, neural networks are rapidly replacing previous technologies. There’s always room for improvement of course, and we’re already working on new types of networks that show even more promise!

The reusable holdout: Preserving validity in adaptive data analysis



Machine learning and statistical analysis play an important role at the forefront of scientific and technological progress. But with all data analysis, there is a danger that findings observed in a particular sample do not generalize to the underlying population from which the data were drawn. A popular XKCD cartoon illustrates that if you test sufficiently many different colors of jelly beans for correlation with acne, you will eventually find one color that correlates with acne at a p-value below the infamous 0.05 significance level.
Image credit: XKCD
Unfortunately, the problem of false discovery is even more delicate than the cartoon suggests. Correcting reported p-values for a fixed number of multiple tests is a fairly well-understood topic in statistics. A simple approach is to multiply each p-value by the number of tests, but there are more sophisticated tools. However, almost all existing approaches to ensuring the validity of statistical inferences assume that the analyst performs a fixed procedure chosen before the data are examined. For example, “test all 20 flavors of jelly beans”. In practice, however, the analyst is informed by data exploration, as well as the results of previous analyses. How did the scientist choose to study acne and jelly beans in the first place? Often such choices are influenced by previous interactions with the same data. This adaptive behavior of the analyst leads to an increased risk of spurious discoveries that are neither prevented nor detected by standard approaches. Each adaptive choice the analyst makes multiplies the number of analyses that could follow; it is often difficult or impossible to describe and analyze the exact experimental setup ahead of time.
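For concreteness, the simple multiply-by-the-number-of-tests approach mentioned above is the Bonferroni correction. The tiny example below uses made-up p-values for 20 jelly-bean tests to show how a raw p-value of 0.04 stops looking significant once corrected.

```python
# Bonferroni correction: multiply each p-value by the number of tests (cap at 1).
# The 20 p-values below are made up for illustration.
import numpy as np

raw_p = np.array([0.04, 0.31, 0.77, 0.12] + [0.50] * 16)   # 20 hypothetical tests
bonferroni = np.minimum(raw_p * len(raw_p), 1.0)

print(raw_p[0])        # 0.04 -> looks "significant" at the 0.05 level in isolation
print(bonferroni[0])   # 0.80 -> clearly not significant after correction
```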

In The Reusable Holdout: Preserving Validity in Adaptive Data Analysis, a joint work with Cynthia Dwork (Microsoft Research), Vitaly Feldman (IBM Almaden Research Center), Toniann Pitassi (University of Toronto), Omer Reingold (Samsung Research America) and Aaron Roth (University of Pennsylvania), to appear in Science tomorrow, we present a new methodology for navigating the challenges of adaptivity. A central application of our general approach is the reusable holdout mechanism that allows the analyst to safely validate the results of many adaptively chosen analyses without the need to collect costly fresh data each time.

The curse of adaptivity

A beautiful example of how false discovery arises as a result of adaptivity is Freedman’s paradox. Suppose that we want to build a model that explains “systolic blood pressure” in terms of hundreds of variables quantifying the intake of various kinds of food. In order to reduce the number of variables and simplify our task, we first select some promising looking variables, for example, those that have a positive correlation with the response variable (systolic blood pressure). We then fit a linear regression model on the selected variables. To measure the goodness of our model fit, we crank out a standard F-test from our favorite statistics textbook and report the resulting p-value.
Inference after selection: We first select a subset of the variables based on a data-dependent criterion and then fit a linear model on the selected variables.
Freedman showed that the reported p-value is highly misleading: even if the data were completely random, with no correlation whatsoever between the response variable and the data points, we’d likely observe a significant p-value! The bias stems from the fact that we selected a subset of the variables adaptively based on the data, but we never account for this fact. There is a huge number of possible subsets of variables we could have selected. The mere fact that we chose one test over the others by peeking at the data creates a selection bias that invalidates the assumptions underlying the F-test.
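The effect is easy to reproduce. The small simulation below, with arbitrary sample sizes and an arbitrary selection threshold, generates purely random data, selects “promising” variables, and still obtains an impressive-looking F-test p-value:

```python
# A small simulation of Freedman's paradox: the data are pure noise, yet selecting
# "promising" variables before fitting yields a strikingly small F-test p-value.
# Sample sizes and the selection threshold are arbitrary choices for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))          # 200 candidate predictors, all noise
y = rng.standard_normal(n)               # response, independent of X

# Adaptive step: keep only variables whose sample correlation with y looks "promising".
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
selected = np.abs(corr) > 0.15

# Then fit a linear model on the selected variables and report the overall F-test.
fit = sm.OLS(y, sm.add_constant(X[:, selected])).fit()
print(selected.sum(), "variables selected; F-test p-value:", fit.f_pvalue)
# The p-value is typically far below 0.05 even though there is no real signal.
```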

Freedman’s paradox bears an important lesson. Significance levels of standard procedures do not capture the vast number of analyses one can choose to carry out or to omit. For this reason, adaptivity is one of the primary explanations of why research findings are frequently false, as argued by Gelman and Loken, who aptly refer to adaptivity as the “garden of forking paths”.

Machine learning competitions and holdout sets

Adaptivity is not just an issue with p-values in the empirical sciences. It affects other domains of data science as well. Machine learning competitions are a perfect example. Competitions have become an extremely popular format for solving prediction and classification problems of all sorts.

Each team in the competition has full access to a publicly available training set which they use to build a predictive model for a certain task such as image classification. Competitors can repeatedly submit a model and see how the model performs on a fixed holdout data set not available to them. The central component of any competition is the public leaderboard which ranks all teams according to the prediction accuracy of their best model so far on the holdout. Every time a team makes a submission they observe the score of their model on the same holdout data. This methodology is inspired by the classic holdout method for validating the performance of a predictive model.
Ideally, the holdout score gives an accurate estimate of the true performance of the model on the underlying distribution from which the data were drawn. However, this is only the case when the model is independent of the holdout data! In contrast, in a competition the model generally incorporates previously observed feedback from the holdout set. Competitors work adaptively and iteratively with the feedback they receive. An improved score for one submission might convince the team to tweak their current approach, while a lower score might cause them to try out a different strategy. But the moment a team modifies their model based on a previously observed holdout score, they create a dependency between the model and the holdout data that invalidates the assumption of the classic holdout method. As a result, competitors may begin to overfit to the holdout data that supports the leaderboard. This means that their score on the public leaderboard continues to improve, while the true performance of the model does not. In fact, unreliable leaderboards are a widely observed phenomenon in machine learning competitions.
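A toy simulation illustrates the effect. In the sketch below (with made-up sizes and a deliberately naive tweak-and-keep strategy), the labels are random coin flips, yet the holdout “leaderboard” score keeps climbing while accuracy on fresh data stays at chance:

```python
# Toy demonstration of overfitting to a fixed holdout through adaptive submissions.
# Labels are random coin flips, so no model can truly beat 50% accuracy; yet by
# keeping only the tweaks that raise the holdout score, the "leaderboard" climbs
# while accuracy on fresh data stays near chance. All sizes are made up.
import numpy as np

rng = np.random.default_rng(1)
n_holdout, n_fresh, d = 500, 5000, 50
X_holdout = rng.standard_normal((n_holdout, d))
y_holdout = rng.integers(0, 2, n_holdout)            # pure-noise labels
X_fresh = rng.standard_normal((n_fresh, d))
y_fresh = rng.integers(0, 2, n_fresh)

def accuracy(w, X, y):
    return np.mean((X @ w > 0).astype(int) == y)

w = rng.standard_normal(d)                            # initial "model"
best = accuracy(w, X_holdout, y_holdout)
for _ in range(1000):                                 # 1000 adaptive submissions
    candidate = w + 0.1 * rng.standard_normal(d)      # tweak the current model
    score = accuracy(candidate, X_holdout, y_holdout)
    if score > best:                                  # keep only tweaks that help on the holdout
        w, best = candidate, score

print("leaderboard (holdout) accuracy:", best)                        # climbs well above 0.5
print("true accuracy on fresh data:", accuracy(w, X_fresh, y_fresh))  # stays near 0.5
```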

Reusable holdout sets

A standard proposal for coping with adaptivity is simply to discourage it. In the empirical sciences, this proposal is known as pre-registration and requires the researcher to specify the exact experimental setup ahead of time. While possible in some simple cases, it is in general too restrictive as it runs counter to today’s complex data analysis workflows.

Rather than limiting the analyst, our approach provides means of reliably verifying the results of an arbitrary adaptive data analysis. The key tool for doing so is what we call the reusable holdout method. As with the classic holdout method discussed above, the analyst is given unfettered access to the training data. What changes is that there is a new algorithm in charge of evaluating statistics on the holdout set. This algorithm ensures that the holdout set maintains the essential guarantees of fresh data over the course of many estimation steps.
The limit of the method is determined by the size of the holdout set: as our theory shows, the number of times the holdout set may be used grows roughly as the square of the number of collected data points in the holdout.
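To give a flavor of how such an algorithm can work, here is a simplified sketch in the spirit of the paper’s Thresholdout mechanism. The parameter values are illustrative only; the paper’s analysis prescribes how to set the threshold and noise scale.

```python
# Simplified sketch in the spirit of the Thresholdout mechanism: the analyst never
# reads holdout statistics directly. If training and holdout agree, only the
# training estimate is released; if they disagree by more than a noisy threshold,
# a noised holdout estimate is released instead. Parameters here are illustrative.
import numpy as np

class ReusableHoldout:
    def __init__(self, holdout_data, threshold=0.04, sigma=0.01, seed=0):
        self.holdout = holdout_data
        self.threshold = threshold
        self.sigma = sigma
        self.rng = np.random.default_rng(seed)

    def query(self, statistic, training_data):
        """statistic: a function mapping a dataset to a number in [0, 1]."""
        train_val = statistic(training_data)
        holdout_val = statistic(self.holdout)
        gap = abs(train_val - holdout_val) + self.rng.laplace(0, self.sigma)
        if gap <= self.threshold + self.rng.laplace(0, self.sigma):
            return train_val                                   # training data already answers accurately
        return holdout_val + self.rng.laplace(0, self.sigma)   # otherwise, a noisy holdout answer

# Usage: answer = holdout.query(lambda data: my_accuracy_estimate(data), training_data)
```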

Armed with the reusable holdout, the analyst is free to explore the training data and verify tentative conclusions on the holdout set. It is now entirely safe to use any information provided by the holdout algorithm in the choice of new analyses to carry out, or the tweaking of existing models and parameters.

A general methodology

The reusable holdout is only one instance of a broader methodology that is, perhaps surprisingly, based on differential privacy—a notion of privacy preservation in data analysis. At its core, differential privacy is a notion of stability requiring that any single sample should not influence the outcome of the analysis significantly.
Example of a stable learning algorithm: Deletion of any single data point does not affect the accuracy of the classifier much.
A beautiful line of work in machine learning shows that various notions of stability imply generalization. That is, any sample estimate computed by a stable algorithm (such as the prediction accuracy of a model on a sample) must be close to what we would observe on fresh data.

What sets differential privacy apart from other stability notions is that it is preserved by adaptive composition. Combining multiple algorithms that each preserve differential privacy yields a new algorithm that also satisfies differential privacy, albeit at some quantitative loss in the stability guarantee. This is true even if the output of one algorithm influences the choice of the next. This strong adaptive composition property is what makes differential privacy an excellent stability notion for adaptive data analysis.

In a nutshell, the reusable holdout mechanism is simply this: access the holdout set only through a suitable differentially private algorithm. It is important to note, however, that the user does not need to understand differential privacy to use our method. The user interface of the reusable holdout is the same as that of the widely used classical method.

Reliable benchmarks

A closely related work with Avrim Blum dives deeper into the problem of maintaining a reliable leaderboard in machine learning competitions (see this blog post for more background). While the reusable holdout could be used directly for this purpose, it turns out that a variant of the reusable holdout, which we call the Ladder algorithm, provides even better accuracy.

This method is not just useful for machine learning competitions, since there are many problems that are roughly equivalent to that of maintaining an accurate leaderboard in a competition. Consider, for example, a performance benchmark that a company uses to test improvements to a system internally before deploying them in a production system. As the benchmark data set is used repeatedly and adaptively for tasks such as model selection, hyper-parameter search and testing, there is a danger that eventually the benchmark becomes unreliable.

Conclusion

Modern data analysis is inherently an adaptive process. Attempts to limit what data scientists will do in practice are ill-fated. Instead we should create tools that respect the usual workflow of data science while at the same time increasing the reliability of data-driven insights. It is our goal to continue exploring techniques that can help create more reliable validation procedures and benchmarks that track true performance more accurately than existing methods.

Young people who are changing the world through science



(Cross-posted from the Google for Education Blog)

Sometimes the biggest discoveries are made by the youngest scientists. They’re curious and not afraid to ask questions, and it’s this spirit of exploration that leads them to try, and then try again. Thousands of these inquisitive young minds from around the world submitted projects for this year’s Google Science Fair, and today we’re thrilled to announce the 20 Global Finalists whose bright ideas could change the world.

From purifying water with corn cobs to transporting Ebola antibodies through silk, from extracting water from air to quickly transporting vaccines to areas in need, these students have all tried inventive, unconventional things to help solve challenges they see around them. And did we mention that they’re all 18 or younger?

Over the next 20 days, we’ll be highlighting each of the 20 impressive finalist projects in the Spotlight on a Young Scientist series on the Google for Education blog, sharing more about these young people and what inspires them.
Then on September 21st, these students will join us in Mountain View to present their projects to a panel of notable international scientists and scholars, competing for a $50,000 scholarship and other incredible prizes from our partners at LEGO Education, National Geographic, Scientific American and Virgin Galactic.

Congratulations to our finalists and everyone who submitted projects for this year’s Science Fair. Thank you for being curious and brave enough to try to change the world through science.

See through the clouds with Earth Engine and Sentinel-1 Data



This year the Google Earth Engine team attended the European Geosciences Union General Assembly meeting in Vienna, Austria to engage with a number of European geoscientific partners. This was just the first of a series of European summits the team has attended over the past few months, including, most recently, the IEEE Geoscience and Remote Sensing Society meeting held last week in Milan, Italy.
Noel Gorelick presenting Google Earth Engine at EGU 2015.
We are very excited to be collaborating with many European scientists from esteemed institutions such as the European Commission Joint Research Centre, Wageningen University, and University of Pavia. These researchers are utilizing the Earth Engine geospatial analysis platform to address issues of global importance in areas such as food security, deforestation detection, urban settlement detection, and freshwater availability.

Thanks to the enlightened free and open data policy of the European Commission and European Space Agency, we are pleased to announce the availability of Copernicus Sentinel-1 data through Earth Engine for visualization and analysis. Sentinel-1, a radar imaging satellite with the ability to see through clouds, is the first of at least 6 Copernicus satellites going up in the next 6 years.
Sentinel-1 data visualized using Earth Engine, showing Vienna (left) and Milan (right).
Wind farms seen off the Eastern coast of England.
This radar data offers a powerful complement to the optical and thermal data from satellites like Landsat that are already available in the Earth Engine public data catalog. If you are a geoscientist interested in accessing and analyzing the newly available EC/ESA Sentinel-1 data, or anything else in our multi-petabyte data catalog, please sign up for Google Earth Engine.
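For example, a signed-up user can query the new collection from the Earth Engine Python API with just a few lines. The sketch below assumes the Sentinel-1 ground-range-detected data is exposed under the collection ID 'COPERNICUS/S1_GRD'.

```python
# Minimal sketch using the Earth Engine Python API to count recent Sentinel-1 scenes
# over Vienna. Assumes you have signed up for Earth Engine (and authenticated) and
# that the Sentinel-1 collection is exposed under the ID 'COPERNICUS/S1_GRD'.
import ee

ee.Initialize()

vienna = ee.Geometry.Point(16.37, 48.21)          # longitude, latitude
s1 = (ee.ImageCollection('COPERNICUS/S1_GRD')
        .filterBounds(vienna)
        .filterDate('2015-01-01', '2015-07-01'))

print('Scenes found:', s1.size().getInfo())
# Because radar sees through clouds, every scene is usable, unlike optical imagery
# where cloudy acquisitions must be masked or discarded.
```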

We look forward to further engagements with the European research community and are excited to see what the world will do with the data from the European Union's Copernicus program satellites.

ICSE 2015 and Software Engineering Research at Google



The large scale of our software engineering efforts at Google often pushes us to develop cutting-edge infrastructure. In May 2015, at the International Conference on Software Engineering (ICSE 2015), we shared some of our software engineering tools and practices and collaborated with the research community through a combination of publications, committee memberships, and workshops. Learn more about some of our research below (Googlers highlighted in blue).

Google was a Gold supporter of ICSE 2015.

Technical Research Papers:
A Flexible and Non-intrusive Approach for Computing Complex Structural Coverage Metrics
Michael W. Whalen, Suzette Person, Neha Rungta, Matt Staats, Daniela Grijincu

Automated Decomposition of Build Targets
Mohsen Vakilian, Raluca Sauciuc, David Morgenthaler, Vahab Mirrokni

Tricorder: Building a Program Analysis Ecosystem
Caitlin Sadowski, Jeffrey van Gogh, Ciera Jaspan, Emma Soederberg, Collin Winter

Software Engineering in Practice (SEIP) Papers:
Comparing Software Architecture Recovery Techniques Using Accurate Dependencies
Thibaud Lutellier, Devin Chollak, Joshua Garcia, Lin Tan, Derek Rayside, Nenad Medvidovic, Robert Kroeger

Technical Briefings:
Software Engineering for Privacy in-the-Large
Pauline Anthonysamy, Awais Rashid

Workshop Organizers:
2nd International Workshop on Requirements Engineering and Testing (RET 2015)
Elizabeth Bjarnason, Mirko Morandini, Markus Borg, Michael Unterkalmsteiner, Michael Felderer, Matthew Staats

Committee Members:
Caitlin Sadowski - Program Committee Member and Distinguished Reviewer Award Winner
James Andrews - Review Committee Member
Ray Buse - Software Engineering in Practice (SEIP) Committee Member and Demonstrations Committee Member
John Penix - Software Engineering in Practice (SEIP) Committee Member
Marija Mikic - Poster Co-chair
Daniel Popescu and Ivo Krka - Poster Committee Members

How Google Translate squeezes deep learning onto a phone



Today we announced that the Google Translate app now does real-time visual translation of 20 more languages. So the next time you’re in Prague and can’t read a menu, we’ve got your back. But how are we able to recognize these new languages?

In short: deep neural nets. When the Word Lens team joined Google, we were excited for the opportunity to work with some of the leading researchers in deep learning. Neural nets have gotten a lot of attention in the last few years because they’ve set all kinds of records in image recognition. Five years ago, if you gave a computer an image of a cat or a dog, it had trouble telling which was which. Thanks to convolutional neural networks, not only can computers tell the difference between cats and dogs, they can even recognize different breeds of dogs. Yes, they’re good for more than just trippy art—if you're translating a foreign menu or sign with the latest version of Google's Translate app, you're now using a deep neural net. And the amazing part is it can all work on your phone, without an Internet connection. Here’s how.

Step by step

First, when a camera image comes in, the Google Translate app has to find the letters in the picture. It needs to weed out background objects like trees or cars, and pick up on the words we want translated. It looks for blobs of pixels that are similar in color and that sit near other similar blobs. Those blobs are possibly letters, and when they appear next to each other they form a continuous line of text we should read.
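A rough illustration of this grouping step, using connected components from scipy (the production pipeline is of course far more sophisticated):

```python
# A rough illustration of the "blobs of similar pixels" idea using connected
# components. This only shows the kind of grouping the text describes.
import numpy as np
from scipy import ndimage

def candidate_letter_boxes(gray_image, dark_threshold=100, min_pixels=20):
    """Return bounding-box slices of dark blobs that could plausibly be letters."""
    mask = gray_image < dark_threshold                  # assumes dark text on a light background
    labels, num_blobs = ndimage.label(mask)             # group touching pixels into blobs
    boxes = ndimage.find_objects(labels)                # boxes[i] bounds the blob labeled i + 1
    return [box for i, box in enumerate(boxes)
            if (labels[box] == i + 1).sum() >= min_pixels]   # drop tiny specks of noise

# Nearby boxes of similar height can then be chained into candidate lines of text.
```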
Second, Translate has to recognize what each letter actually is. This is where deep learning comes in. We use a convolutional neural network, training it on letters and non-letters so it can learn what different letters look like.

But interestingly, if we train just on very “clean”-looking letters, we risk not understanding what real-life letters look like. Letters out in the real world are marred by reflections, dirt, smudges, and all kinds of weirdness. So we built our letter generator to create all kinds of fake “dirt” to convincingly mimic the noisiness of the real world—fake reflections, fake smudges, fake weirdness all around.

Why not just train on real-life photos of letters? Well, it’s tough to find enough examples in all the languages we need, and it’s harder to maintain the fine control over what examples we use when we’re aiming to train a really efficient, compact neural network. So it’s more effective to simulate the dirt.
Some of the “dirty” letters we use for training. Dirt, highlights, and rotation, but not too much because we don’t want to confuse our neural net.
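A toy version of such a generator might render a clean glyph and then add a small rotation, fake streaks, and sensor-style noise. The font, ranges, and noise levels below are arbitrary; the real generator models real-world degradations much more carefully.

```python
# Toy "dirty letter" generator: render a clean glyph, then add a small rotation,
# bright streaks to mimic reflections or smudges, and pixel noise.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def dirty_letter(char, size=32, max_rotation=10, noise_level=20, seed=None):
    rng = np.random.default_rng(seed)
    img = Image.new('L', (size, size), color=255)               # white background
    draw = ImageDraw.Draw(img)
    draw.text((size // 4, size // 8), char, fill=0, font=ImageFont.load_default())
    img = img.rotate(rng.uniform(-max_rotation, max_rotation), fillcolor=255)

    pixels = np.asarray(img, dtype=np.float32)
    pixels += rng.normal(0, noise_level, pixels.shape)          # sensor-style noise
    for _ in range(3):                                          # a few fake bright streaks
        row = rng.integers(0, size)
        pixels[row, :] = np.minimum(pixels[row, :] + rng.uniform(40, 120), 255)
    return np.clip(pixels, 0, 255).astype(np.uint8)
```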
The third step is to take those recognized letters, and look them up in a dictionary to get translations. Since every previous step could have failed in some way, the dictionary lookup needs to be approximate. That way, if we read an ‘S’ as a ‘5’, we’ll still be able to find the word ‘5uper’.
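As a tiny illustration of approximate lookup, an edit-distance-style search over the dictionary recovers the intended word even when a character is misread (the cutoff and word list below are made up):

```python
# Approximate dictionary lookup: even if OCR reads 'S' as '5', a fuzzy match over
# the vocabulary still recovers the intended word. The word list is made up.
import difflib

dictionary = ['super', 'soup', 'sugar', 'market', 'menu']

def approximate_lookup(token, vocabulary, cutoff=0.6):
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(approximate_lookup('5uper', dictionary))   # -> 'super'
```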

Finally, we render the translation on top of the original words in the same style as the original. We can do this because we’ve already found and read the letters in the image, so we know exactly where they are. We can look at the colors surrounding the letters and use that to erase the original letters. And then we can draw the translation on top using the original foreground color.

Crunching it down for mobile

Now, if we could do this visual translation in our data centers, it wouldn’t be too hard. But a lot of our users, especially those getting online for the very first time, have slow or intermittent network connections and smartphones starved for computing power. These low-end phones can be about 50 times slower than a good laptop—and a good laptop is already much slower than the data centers that typically run our image recognition systems. So how do we get visual translation on these phones, with no connection to the cloud, translating in real-time as the camera moves around?

We needed to develop a very small neural net, and put severe limits on how much we tried to teach it—in essence, put an upper bound on the density of information it handles. The challenge here was in creating the most effective training data. Since we’re generating our own training data, we put a lot of effort into including just the right data and nothing more. For instance, we want to be able to recognize a letter with a small amount of rotation, but not too much. If we overdo the rotation, the neural network will use too much of its information density on unimportant things. So we put effort into making tools that would give us a fast iteration time and good visualizations. Inside of a few minutes, we can change the algorithms for generating training data, generate it, retrain, and visualize. From there we can look at what kind of letters are failing and why. At one point, we were warping our training data too much, and ‘$’ started to be recognized as ‘S’. We were able to quickly identify that and adjust the warping parameters to fix the problem. It was like trying to paint a picture of letters that you’d see in real life with all their imperfections painted just perfectly.

To achieve real-time, we also heavily optimized and hand-tuned the math operations. That meant using the mobile processor’s SIMD instructions and tuning things like matrix multiplies to fit processing into all levels of cache memory.

In the end, we were able to get our networks to give us significantly better results while running about as fast as our old system—great for translating what you see around you on the fly. Sometimes new technology can seem very abstract, and it's not always obvious what the applications for things like convolutional neural nets could be. We think breaking down language barriers is one great use.

The Thorny Issue of CS Teacher Certification



(Cross-posted on the Google for Education Blog)

There is a tremendous focus on computer science education in K-12. Educators, policy makers, the non-profit sector and industry are sharing a common message about the benefits of computer science knowledge and the opportunities it provides. In this wider effort to improve access to computer science education, one of the challenges we face is how to ensure that there is a pipeline of computer science teachers to meet the growing demand for this expertise in schools.

In 2013 the Computer Science Teachers Association (CSTA) released Bugs in the System: Computer Science Teacher Certification in the U.S. Based on 18 months of intensive Google-funded research, this report characterized the current state of teacher certification as rife with “bugs in the system” that prevent it from functioning as intended. Examples of current challenges included states where someone with no knowledge of computer science can teach it, states where the requirements for teacher certification are impossible to meet, and states where certification administrators are confused about what computer science is. The report also demonstrated that this is actually a circular problem: states are hesitant to require certification when they have no programs to train the teachers, and teacher training programs are hesitant to create programs for which there is no clear certification pathway.
Addressing the issues with the current teacher preparation and certification system is a complex challenge and it requires the commitment of the entire computer science community. Fortunately, some of this work is already underway. CSTA’s report provides a set of recommendations aimed at addressing these issues. Educators, advocates, and policymakers are also beginning to examine their systems and how to reform them.

Google is also exploring how we might help. We convened a group of teacher preparation faculty, researchers, and administrators from across the country to brainstorm how we might work with teacher preparation programs to support the inclusion of computational thinking into teacher preparation programs. As a result of this meeting, Dr. Aman Yadav, Professor of Educational Psychology and Educational Technology at Michigan State University, is now working on two research articles aimed at helping teacher preparation program leaders better understand what computational thinking is, and how it supports learning across multiple disciplines.

Google will also be launching a new online course called Computational Thinking for Educators. In this free course, educators working with students between the ages of 13 and 18 will learn how incorporating computational thinking can enhance and enrich learning in diverse academic disciplines and can help boost students’ confidence when dealing with ambiguous, complex or open-ended problems. The course will run from July 15 to September 30, 2015.

These kinds of community partnerships are one way Google can contribute to practitioner-centered solutions and further the computer science education community’s efforts to help everyone understand that computer science is a deeply important academic discipline, one that deserves a place in the K-12 canon and well-prepared teachers to share this knowledge with students.

Should My Kid Learn to Code?



(Cross-posted on the Google for Education Blog)

Over the last few years, successful marketing campaigns such as Hour of Code and Made with Code have helped K12 students become increasingly aware of the power and relevance of computer programming across all fields. In addition, there has been growth in developer bootcamps, online “learn to code” programs (code.org, CS First, Khan Academy, Codecademy, Blockly Games, etc.), and non-profits focused specifically on girls and underrepresented minorities (URMs) (Technovation, Girls who Code, Black Girls Code, #YesWeCode, etc.).

This is good news, as we need many more computing professionals than are currently graduating from Computer Science (CS) and Information Technology (IT) programs. There is evidence that students are starting to respond positively, too, given that undergraduate departments are experiencing capacity issues in accommodating all the students who want to study CS.

Most educators agree that basic application and internet skills (typing, word processing, spreadsheets, web literacy and safety, etc.) are fundamental, and thus, “digital literacy” is a part of K12 curriculum. But is coding now a fundamental literacy, like reading or writing, that all K12 students need to learn as well?

In order to gain a deeper understanding of the devices and applications they use every day, it’s important for all students to try coding, and doing so also has the positive effect of inspiring more potential future programmers. Furthermore, there is a set of relevant skills, often consolidated as “computational thinking”, that is becoming more important for all students, given the growth in the use of computers, algorithms and data in many fields. These include:
  • Abstraction, which is the replacement of a complex real-world situation with a simple model within which we can solve problems. CS is the science of abstraction: creating the right model for a problem, representing it in a computer, and then devising appropriate automated techniques to solve the problem within the model. A spreadsheet is an abstraction of an accountant’s worksheet; a word processor is an abstraction of a typewriter; a game like Civilization is an abstraction of history.
  • An algorithm is a procedure for solving a problem in a finite number of steps that can involve repetition of operations, or branching to one set of operations or another based on a condition. Being able to represent a problem-solving process as an algorithm is becoming increasingly important in any field that uses computing as a primary tool (business, economics, statistics, medicine, engineering, etc.). Success in these fields requires algorithm design skills.
  • As computers become essential in a particular field, more domain-specific data is collected, analyzed and used to make decisions. Students need to understand how to find the data; how to collect it appropriately and with respect to privacy considerations; how much data is needed for a particular problem; how to remove noise from data; what techniques are most appropriate for analysis; how to use an analysis to make a decision; etc. Such data skills are already required in many fields.
These computational thinking skills are becoming more important as computers, algorithms and data become ubiquitous. Coding will also become more common, particularly with the growth in the use of visual programming languages like Blockly, which remove the need to learn programming language syntax and, via custom blocks, can be used as an abstraction for many different applications.
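To make the algorithm skill concrete, here is a very small example of representing a problem-solving process as code, using the repetition and branching described above (the data are made up):

```python
# A tiny example of algorithmic thinking: a procedure with repetition (the loop)
# and branching (the comparison) that finds the largest value in a list of
# measurements. The numbers are made up for illustration.
def largest(measurements):
    best = measurements[0]
    for value in measurements[1:]:   # repetition over the remaining items
        if value > best:             # branching on a condition
            best = value
    return best

print(largest([3, 41, 12, 9, 74, 15]))   # -> 74
```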

One way to represent these different skill sets and the students who need them is as follows:
All students need digital literacy, many need computational thinking depending on their career choice, and some will actually do the software development in high-tech companies, IT departments, or other specialized areas. I don’t believe all kids should learn to code seriously, but all kids should try it via programs like code.org, CS First or Khan Academy. This gives students a good introduction to computational thinking and coding, and provides them with a basis for making an informed decision on whether CS or IT is something they wish to pursue as a career.

Simulating fermionic particles with superconducting quantum hardware



Digital quantum simulation is one of the key applications of a future, viable quantum computer. Researchers around the world hope that quantum computing will not only be able to process certain calculations faster than any classical computer, but also help simulate nature more accurately and answer longstanding questions with regard to high temperature superconductivity, complex quantum materials, and applications in quantum chemistry.

A crucial part in describing nature is simulating electrons. Without electrons, you cannot describe metals and their conductivity, or the interatomic bonds which hold molecules together. But simulating systems with many electrons makes for a very tough problem on classical computers, due to some of their peculiar quantum properties.

Electrons are fermionic particles, and as such obey the well-known Pauli exclusion principle, which states that no two fermions in a system can occupy the same quantum state. This is due to a property called anticommutation, an inherent quantum mechanical behavior of all fermions that makes it very tricky to fully simulate anything composed of complex interactions between electrons. The upshot of this anticommutative property is that if you have identical electrons, one at position A and another at position B, and you swap them, you end up with a different quantum state. If your simulation has many electrons you need to carefully keep track of these changes, while ensuring that all the interactions between electrons remain completely, yet separately, tunable.
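One standard way to encode fermionic modes in qubits is the Jordan-Wigner transformation. The short numerical check below, which is an illustration of the mapping rather than the exact construction used in the experiment, verifies that the encoded operators obey the required anticommutation relations:

```python
# Jordan-Wigner illustration: verify numerically that the encoded mode operators
# obey the fermionic relations {a_i, a_j} = 0 and {a_i, a_j^dag} = delta_ij * I.
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # qubit lowering operator

def jordan_wigner(mode, n_modes):
    """Annihilation operator for one fermionic mode on an n-qubit register."""
    op = np.eye(1)
    for k in range(n_modes):
        factor = Z if k < mode else (lower if k == mode else I2)
        op = np.kron(op, factor)
    return op

def anti(A, B):
    return A @ B + B @ A

n = 3
a = [jordan_wigner(j, n) for j in range(n)]
for i in range(n):
    for j in range(n):
        assert np.allclose(anti(a[i], a[j]), 0)                      # {a_i, a_j} = 0
        expected = np.eye(2 ** n) if i == j else np.zeros((2 ** n, 2 ** n))
        assert np.allclose(anti(a[i], a[j].conj().T), expected)      # {a_i, a_j^dag}
print("Jordan-Wigner operators satisfy the fermionic anticommutation relations.")
```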

Add to that the memory errors caused by fluctuations or noise from the environment, and the fact that quantum physics prevents one from directly monitoring the superconducting quantum bits (“qubits”) of a quantum computer to account for those errors, and you've got your hands full. However, earlier this year we reported on some exciting steps towards Quantum Error Correction. As it turns out, the hardware we built isn't only usable for error correction; it can also be used for quantum simulation.

In Digital quantum simulation of fermionic models with a superconducting circuit, published in Nature Communications, we present digital methods that enable the simulation of the complex interactions between fermionic particles, by using single-qubit and two-qubit quantum logic gates as building blocks. And with the recent advances in hardware and control we can now implement them.

We took our qubits and made them act like interacting fermions. We experimentally verified that the simulated particles anticommute, and implemented static and time-varying models. With over 300 logic gates, it is the largest digital quantum simulation to date, and the first implementation in a solid-state device.
Left: Model picture with four fermionic modes in two sites. The modes are occupied or unoccupied. For example, we can start with two fermionic particles in the right well, by occupying the blue and green mode. If the particles repel each other, there's a good chance that one of them will hop to the left well through the process of quantum tunneling through the barrier. It will then occupy the red or purple mode. This interplay of on-site interaction and hopping lies at the core of describing processes in physics and chemistry, ranging from the conductivity of metals to the binding between atoms in molecules. Right: The false-colored cross-shaped structures are the superconducting quantum bits. The colors correspond to the modes, so if we have two fermionic particles in the blue and red modes, the rightmost two quantum bits are excited.
Coming up with an efficient sequence of logic gates that can accurately model the interactions for systems of fermions wasn’t easy. So we teamed up with Dr. Lucas Lamata, M.Sc. Laura García-Álvarez, and Prof. Enrique Solano from the QUTIS group at the University of the Basque Country (UPV/EHU) in Bilbao, Spain, who are experts in constructing algorithms and translating them into the streams of logic gates we can implement with our hardware.

For the future, digital quantum simulation holds the promise that it can be run on an error-corrected quantum computer. But before that, we foresee the construction of larger testbeds for simulation with improvements in logic gates and architecture. This experiment is a critical step on the path to creating a quantum simulator capable of modeling fermions as well as bosons (particles which can be interchanged, as opposed to fermions), opening up exciting possibilities for simulating physical and chemical processes in nature.