Category Archives: Computer

Student wins national contest for her work fighting bias in machine learning

When Joy Buolamwini, an MIT master’s candidate in media arts and sciences, sits in front of a mirror, she sees a black woman in her 20s. But when her photo is run through facial recognition software, it does not recognize her face. A seemingly neutral machine programmed with algorithms (codified processes) simply fails to detect her features. Buolamwini is, she says, “on the wrong side of computational decisions” that can lead to exclusionary and discriminatory practices and behaviors in society.

That phenomenon, which Buolamwini calls the “coded gaze,” is what motivated her late last year to launch the Algorithmic Justice League (AJL) to highlight such bias through provocative media and interactive exhibitions; to provide space for people to voice concerns and experiences with coded discrimination; and to develop practices for accountability during the design, development, and deployment phases of coded systems.

That work contributed to the Media Lab student’s earning the grand prize in the professional category of The Search for Hidden Figures. The nationwide contest, created by PepsiCo and 21st Century Fox in partnership with the New York Academy of Sciences, is named for a recently released film that tells the real-life story of three African-American women at NASA whose mathematical brilliance helped launch the United States into the space race in the early 1960s.

“I’m honored to receive this recognition, and I’ll use the prize to continue my mission to show compassion through computation,” says Buolamwini, who was born in Canada, then lived in Ghana and, at the age of four, moved to Oxford, Mississippi. She’s a two-time recipient of an Astronaut Scholarship in a program established by NASA’s Mercury 7 crew members, including the late astronaut John Glenn, who are depicted in the film “Hidden Figures.”

The film had a big impact on Buolamwini when she saw a special MIT sneak preview in early December: “I witnessed the power of storytelling to change cultural perceptions by highlighting hidden truths. After the screening where I met Margot Lee Shetterly, who wrote the book on which the film is based, I left inspired to tell my story, and applied for the contest. Being selected as a grand prize winner provides affirmation that pursuing STEM is worth celebrating. And it’s an important reminder to share the stories of discriminatory experiences that necessitate the Algorithmic Justice League as well as the uplifting stories of people who come together to create a world where technology can work for all of us and drive social change.”

The Search for Hidden Figures contest attracted 7,300 submissions from students across the United States. As one of two grand prize winners, Buolamwini receives a $50,000 scholarship, a trip to the Kennedy Space Center in Florida, plus access to New York Academy of Sciences training materials and programs in STEM. She plans to use the prize resources to develop what she calls “bias busting” tools to help defeat bias in machine learning.

That is the focus of her current research at the MIT Media Lab, where Buolamwini is in the Civic Media group pursuing a master’s degree with an eye toward a PhD. “The Media Lab serves as a unifying thread in my journey in STEM. Until I saw the lab on TV, I didn’t realize there was a place dedicated to exploring the future of humanity and technology by allowing us to indulge our imaginations by continuously asking, ‘What if?’”

Computer system learns to recognize sounds by watching video

In recent years, computers have gotten remarkably good at recognizing speech and images: Think of the dictation software on most cellphones, or the algorithms that automatically identify people in photos posted to Facebook.

But recognition of natural sounds — such as crowds cheering or waves crashing — has lagged behind. That’s because most automated recognition systems, whether they process audio or visual information, are the result of machine learning, in which computers search for patterns in huge compendia of training data. Usually, the training data has to be first annotated by hand, which is prohibitively expensive for all but the highest-demand applications.

Sound recognition may be catching up, however, thanks to researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). At the Neural Information Processing Systems conference next week, they will present a sound-recognition system that outperforms its predecessors but didn’t require hand-annotated data during training.

Instead, the researchers trained the system on video. First, existing computer vision systems that recognize scenes and objects categorized the images in the video. The new system then found correlations between those visual categories and natural sounds.

“Computer vision has gotten so good that we can transfer it to other domains,” says Carl Vondrick, an MIT graduate student in electrical engineering and computer science and one of the paper’s two first authors. “We’re capitalizing on the natural synchronization between vision and sound. We scale up with tons of unlabeled video to learn to understand sound.”
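To make the idea concrete, here is a minimal sketch of this kind of cross-modal training, assuming a pretrained vision classifier acts as a "teacher" that labels video frames while a small audio "student" network learns to reproduce those labels from the matching soundtrack. The network shapes, the stub teacher, and the random data are placeholders for illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder for a pretrained scene/object classifier (any ImageNet-style model would do).
vision_teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1000))
vision_teacher.eval()

class AudioStudent(nn.Module):
    """Hypothetical audio network: maps a raw waveform to the same 1000 class scores."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=32, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, waveform):            # waveform: (batch, 1, samples)
        return self.net(waveform)

def train_step(student, optimizer, waveform, frame):
    # The vision network supplies "soft labels" for the frame that accompanies the clip.
    with torch.no_grad():
        teacher_probs = F.softmax(vision_teacher(frame), dim=1)
    student_logprobs = F.log_softmax(student(waveform), dim=1)
    # Train the audio network to match the vision network's predictions (no hand labels needed).
    loss = F.kl_div(student_logprobs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

student = AudioStudent()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
waveform = torch.randn(8, 1, 22050)         # fake batch of one-second clips
frames = torch.randn(8, 3, 224, 224)        # the corresponding video frames
print(train_step(student, opt, waveform, frames))
```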

The researchers tested their system on two standard databases of annotated sound recordings, and it was between 13 and 15 percent more accurate than the best-performing previous system. On a data set with 10 different sound categories, it could categorize sounds with 92 percent accuracy, and on a data set with 50 categories it performed with 74 percent accuracy. On those same data sets, humans are 96 percent and 81 percent accurate, respectively.

“Even humans are ambiguous,” says Yusuf Aytar, the paper’s other first author and a postdoc in the lab of MIT professor of electrical engineering and computer science Antonio Torralba. Torralba is the final co-author on the paper.

“We did an experiment with Carl,” Aytar says. “Carl was looking at the computer monitor, and I couldn’t see it. He would play a recording and I would try to guess what it was. It turns out this is really, really hard. I could tell indoor from outdoor, basic guesses, but when it comes to the details — ‘Is it a restaurant?’ — those details are missing. Even for annotation purposes, the task is really hard.”

Complementary modalities

Because it takes far less power to collect and process audio data than it does to collect and process visual data, the researchers envision that a sound-recognition system could be used to improve the context sensitivity of mobile devices.

When coupled with GPS data, for instance, a sound-recognition system could determine that a cellphone user is in a movie theater and that the movie has started, and the phone could automatically route calls to a prerecorded outgoing message. Similarly, sound recognition could improve the situational awareness of autonomous robots.
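A toy sketch of that context-sensitivity idea follows; `classify_sound` and `lookup_place` are hypothetical stand-ins for a real sound-recognition model and a real GPS/place database, not an actual API.

```python
def classify_sound(audio_clip):
    """Pretend sound classifier: returns a label such as 'movie', 'street', or 'crowd'."""
    return "movie"

def lookup_place(latitude, longitude):
    """Pretend reverse-geocoder: returns a place category for the coordinates."""
    return "movie_theater"

def route_incoming_call(latitude, longitude, audio_clip):
    place = lookup_place(latitude, longitude)
    sound = classify_sound(audio_clip)
    # Only silence the phone when both signals agree that a movie is playing.
    if place == "movie_theater" and sound == "movie":
        return "send_to_voicemail"
    return "ring_normally"

print(route_incoming_call(42.36, -71.09, audio_clip=None))  # -> send_to_voicemail
```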

“For instance, think of a self-driving car,” Aytar says. “There’s an ambulance coming, and the car doesn’t see it. If it hears it, it can make future predictions for the ambulance — which path it’s going to take — just purely based on sound.”

The intersection of policy and technology

“When you’re part of a community, you want to leave it better than you found it,” says Keertan Kini, an MEng student in the Department of Electrical Engineering and Computer Science, or Course 6. That philosophy has guided Kini throughout his years at MIT as he has worked to improve policy both inside and outside of the Institute.

A member of the Undergraduate Student Advisory Group, former chair of the Course 6 Underground Guide Committee, and a member of both the Internet Policy Research Initiative (IPRI) and the Advanced Network Architecture group, Kini has focused his research on finding ways that technology and policy can work together. As Kini puts it, “there can be unintended consequences when you don’t have technology makers who are talking to policymakers and you don’t have policymakers talking to technologists.” His goal is to help them talk to each other.

Kini first became interested in politics at 14, when he volunteered for President Obama’s 2008 campaign, making calls and putting up posters. “That was the point I became civically engaged,” says Kini. He went on to campaign for a ballot initiative to raise more funding for his high school, and his interest in public policy hasn’t waned since.

High school was also where Kini became interested in computer science. He took a computer science class on the recommendation of his sister, and in his senior year he started watching lectures on MIT OpenCourseWare (OCW) by Hal Abelson, a professor in MIT’s Department of Electrical Engineering and Computer Science.

“That lecture reframed what computer science was. I loved it,” Kini recalls. “The professor said ‘it’s not about computers, and it’s not about science.’ It might be an art or engineering, but it’s not science, because what we’re working with are idealized components, and ultimately the power of what we can actually achieve with them is not based so much on physical limitations as on the limitations of the mind.”

In part thanks to Abelson’s OCW lectures, Kini came to MIT to study electrical engineering and computer science. He is now pursuing an MEng in the same field, a fifth-year master’s program that follows directly from his undergraduate studies.

Combining two disciplines

Kini set his policy interest to the side his freshman year, until he took 6.805J (Foundations of Information Policy), with Abelson, the same professor who inspired Kini to study computer science. After taking Abelson’s course, Kini joined him and Daniel Weitzner, a principal research scientist in the Computer Science and Artificial Intelligence Laboratory, in putting together a big data and privacy workshop for the White House in the wake of the Edward Snowden leak of classified information from the National Security Agency. Four years later, Kini is now a teaching assistant for 6.805J.

With Weitzner as his advisor, Kini went on to work on a SuperUROP, an advanced version of the Undergraduate Research Opportunities Program in which students take on their own research project for a full year. Kini’s project focused on making it easier for organizations that had experienced a cybersecurity breach to share how the breach happened with other organizations, without accidentally sharing private or confidential information as well.

A moving-target defense technique

When it comes to protecting data from cyberattacks, the information technology (IT) specialists who defend computer networks face attackers armed with certain advantages. For one, while attackers need only find a single vulnerability in a system to gain network access and disrupt, corrupt, or steal data, IT personnel must constantly guard against and work to mitigate myriad and varied network intrusion attempts.

The homogeneity and uniformity of software applications have traditionally created another advantage for cyber attackers. “Attackers can develop a single exploit against a software application and use it to compromise millions of instances of that application because all instances look alike internally,” says Hamed Okhravi, a senior staff member in the Cyber Security and Information Sciences Division at MIT Lincoln Laboratory. To counter this problem, cybersecurity practitioners have implemented randomization techniques in operating systems. These techniques, notably address space layout randomization (ASLR), diversify the memory locations used by each instance of the application at the point at which the application is loaded into memory.
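As a quick way to see load-time randomization in action, the snippet below (my own illustration, not from the researchers) prints the address at which the C library's printf function was mapped; on a Linux or macOS system with ASLR enabled, that address typically changes from one run of the script to the next.

```python
import ctypes
import ctypes.util

# Load the system C library and look at where one of its functions landed in memory.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
printf_addr = ctypes.cast(libc.printf, ctypes.c_void_p).value

print(f"printf is mapped at {hex(printf_addr)}")
# Run the script several times: with ASLR enabled, the address varies per run,
# which is exactly the per-instance diversity that load-time randomization provides.
```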

In response to randomization approaches like ASLR, attackers developed information leakage attacks, also called memory disclosure attacks. Through these software assaults, attackers can make the application disclose how its internals have been randomized while the application is running. Attackers then adjust their exploits to the application’s randomization and successfully hijack control of vulnerable programs. “The power of such attacks has ensured their prevalence in many modern exploit campaigns, including those network infiltrations in which an attacker remains undetected and continues to steal data in the network for a long time,” explains Okhravi, who adds that methods for bypassing ASLR, which is currently deployed in most modern operating systems, and similar defenses can be readily found on the Internet.

Okhravi and colleagues David Bigelow, Robert Rudd, James Landry, and William Streilein, and former staff member Thomas Hobson, have developed a unique randomization technique, timely address space randomization (TASR), to counter information leakage attacks that may thwart ASLR protections. “TASR is the first technology that mitigates an attacker’s ability to leverage information leakage against ASLR, irrespective of the mechanism used to leak information,” says Rudd.

To thwart information leakage attacks, TASR immediately rerandomizes the memory layout every time it observes an application processing an output and input pair. “Information may leak to the attacker on any given program output without anybody being able to detect it, but TASR ensures that the memory layout is rerandomized before the attacker has an opportunity to act on that stolen information, and hence denies them the opportunity to use it to bypass operating system defenses,” says Bigelow. Because TASR’s rerandomization is based on application activity rather than on a set timing (say, every so many minutes), an attacker cannot anticipate the interval during which the leaked information might be used to send an exploit to the application before randomization recurs.
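The trigger condition can be illustrated with a toy event loop. This is purely conceptual: in TASR the rerandomization happens inside the operating system and runtime, not in application code like this.

```python
def rerandomize_layout():
    # Stand-in for the OS-level rerandomization step; conceptually, this remaps the
    # process's memory layout so previously leaked addresses become stale.
    print("memory layout rerandomized")

def handle_events(events):
    saw_output = False
    for kind, payload in events:
        if kind == "output":
            # Any output could have leaked randomized addresses to an attacker.
            saw_output = True
        elif kind == "input":
            if saw_output:
                # Rerandomize before acting on input that could carry an exploit
                # crafted from the leaked information.
                rerandomize_layout()
                saw_output = False
            # ... process the input normally ...

handle_events([("output", "page"), ("input", "request"),
               ("output", "page"), ("output", "page"), ("input", "request")])
```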

Automated speech recognition

Speech recognition systems, such as those that convert speech to text on cellphones, are generally the result of machine learning. A computer pores through thousands or even millions of audio files and their transcriptions, and learns which acoustic features correspond to which typed words.

But transcribing recordings is costly, time-consuming work, which has limited speech recognition to a small subset of languages spoken in wealthy nations.

At the Neural Information Processing Systems conference this week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are presenting a new approach to training speech-recognition systems that doesn’t depend on transcription. Instead, their system analyzes correspondences between images and spoken descriptions of those images, as captured in a large collection of audio recordings. The system then learns which acoustic features of the recordings correlate with which image characteristics.
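One common way to learn such audio-image correspondences, sketched below purely for illustration (the linear encoders, sizes, and ranking loss are generic placeholders, not the exact model in the paper), is to embed each image and each spoken caption in a shared vector space and train the two encoders so that matching pairs score higher than mismatched ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 128

# Toy encoders: a real system would use convolutional networks over spectrograms
# and image pixels; these linear stubs keep the sketch short.
audio_encoder = nn.Linear(4000, EMBED_DIM)       # 4000 = flattened spectrogram (assumed size)
image_encoder = nn.Linear(3 * 64 * 64, EMBED_DIM)

def similarity(a, b):
    return (F.normalize(a, dim=1) * F.normalize(b, dim=1)).sum(dim=1)

def ranking_loss(audio_batch, image_batch, margin=0.2):
    a = audio_encoder(audio_batch)
    v = image_encoder(image_batch.flatten(1))
    pos = similarity(a, v)                        # matching audio/image pairs
    neg = similarity(a, v.roll(1, dims=0))        # mismatched pairs (shift images by one)
    # Matching pairs should beat mismatched pairs by at least the margin.
    return F.relu(margin + neg - pos).mean()

opt = torch.optim.Adam(list(audio_encoder.parameters()) +
                       list(image_encoder.parameters()), lr=1e-3)
audio = torch.randn(16, 4000)                     # fake spoken descriptions
images = torch.randn(16, 3, 64, 64)               # the images they describe
loss = ranking_loss(audio, images)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```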

“The goal of this work is to try to get the machine to learn language more like the way humans do,” says Jim Glass, a senior research scientist at CSAIL and a co-author on the paper describing the new system. “The current methods that people use to train up speech recognizers are very supervised. You get an utterance, and you’re told what’s said. And you do this for a large body of data.

“Big advances have been made — Siri, Google — but it’s expensive to get those annotations, and people have thus focused on, really, the major languages of the world. There are 7,000 languages, and I think less than 2 percent have ASR [automatic speech recognition] capability, and probably nothing is going to be done to address the others. So if you’re trying to think about how technology can be beneficial for society at large, it’s interesting to think about what we need to do to change the current situation. And the approach we’ve been taking through the years is looking at what we can learn with less supervision.”

Technique shrinks data sets while preserving their fundamental mathematical relationships

One way to handle big data is to shrink it. If you can identify a small subset of your data set that preserves its salient mathematical relationships, you may be able to perform useful analyses on it that would be prohibitively time consuming on the full set.

The methods for creating such “coresets” vary according to application, however. Last week, at the Annual Conference on Neural Information Processing Systems, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and the University of Haifa in Israel presented a new coreset-generation technique that’s tailored to a whole family of data analysis tools with applications in natural-language processing, computer vision, signal processing, recommendation systems, weather prediction, finance, and neuroscience, among many others.

“These are all very general algorithms that are used in so many applications,” says Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT and senior author on the new paper. “They’re fundamental to so many problems. By figuring out the coreset for a huge matrix for one of these tools, you can enable computations that at the moment are simply not possible.”

As an example, in their paper the researchers apply their technique to a matrix — that is, a table — that maps every article on the English version of Wikipedia against every word that appears on the site. That’s 1.4 million articles, or matrix rows, and 4.4 million words, or matrix columns.

That matrix would be much too large to analyze using low-rank approximation, an algorithm that can deduce the topics of free-form texts. But with their coreset, the researchers were able to use low-rank approximation to extract clusters of words that denote the 100 most common topics on Wikipedia. The cluster that contains “dress,” “brides,” “bridesmaids,” and “wedding,” for instance, appears to denote the topic of weddings; the cluster that contains “gun,” “fired,” “jammed,” “pistol,” and “shootings” appears to designate the topic of shootings.
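On a toy scale, the kind of low-rank topic extraction described above looks like the sketch below, which builds a small term-document matrix and pulls out word clusters with truncated SVD. This is a standard latent-semantic-analysis recipe, not the coreset construction from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

documents = [
    "the bride and bridesmaids wore dresses to the wedding",
    "the wedding dress was chosen by the bride",
    "the pistol jammed after the gun was fired",
    "police reported several shootings where a gun was fired",
]

# Rows are documents, columns are words: a tiny analogue of the Wikipedia matrix.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(documents)

# Low-rank approximation: keep 2 latent "topics".
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)

terms = vectorizer.get_feature_names_out()
for i, component in enumerate(svd.components_):
    top = component.argsort()[::-1][:4]           # strongest words in this topic
    print(f"topic {i}:", [terms[j] for j in top])
```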

Joining Rus on the paper are Mikhail Volkov, an MIT postdoc in electrical engineering and computer science, and Dan Feldman, director of the University of Haifa’s Robotics and Big Data Lab and a former postdoc in Rus’s group.

The researchers’ new coreset technique is useful for a range of tools with names like singular-value decomposition, principal-component analysis, and latent semantic analysis. But what they all have in common is dimension reduction: They take data sets with large numbers of variables and find approximations of them with far fewer variables.

In this, these tools are similar to coresets. But coresets are application-specific, while dimension-reduction tools are general-purpose. That generality makes them much more computationally intensive than coreset generation — too computationally intensive for practical application to large data sets.
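For intuition about how a small, carefully weighted subset of rows can stand in for a huge matrix, here is a generic norm-based row-sampling sketch. It is a common baseline, not the specific coreset construction the researchers propose; it simply compares the top singular values of the sampled matrix against those of the full one.

```python
import numpy as np

rng = np.random.default_rng(0)
# A tall, roughly low-rank matrix standing in for a big data set.
A = rng.standard_normal((10000, 50)) @ rng.standard_normal((50, 300))

# Sample rows with probability proportional to their squared norm and reweight them,
# so the much smaller matrix approximates A's spectrum.
row_norms = np.sum(A * A, axis=1)
probs = row_norms / row_norms.sum()
k = 500                                           # keep 500 of 10,000 rows
idx = rng.choice(A.shape[0], size=k, replace=True, p=probs)
C = A[idx] / np.sqrt(k * probs[idx, None])

full_sv = np.linalg.svd(A, compute_uv=False)[:5]
core_sv = np.linalg.svd(C, compute_uv=False)[:5]
print("top singular values (full):  ", np.round(full_sv, 1))
print("top singular values (sample):", np.round(core_sv, 1))
```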

System finds and links related data

The age of big data has seen a host of new techniques for analyzing large data sets. But before any of those techniques can be applied, the target data has to be aggregated, organized, and cleaned up.

That turns out to be a shockingly time-consuming task. In a 2016 survey, 80 data scientists told the company CrowdFlower that, on average, they spent 80 percent of their time collecting and organizing data and only 20 percent analyzing it.

An international team of computer scientists hopes to change that, with a new system called Data Civilizer, which automatically finds connections among many different data tables and allows users to perform database-style queries across all of them. The results of the queries can then be saved as new, orderly data sets that may draw information from dozens or even thousands of different tables.
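The flavor of the idea can be seen in a few lines of pandas; this is a simplified illustration of column matching and cross-table querying, not Data Civilizer's actual algorithms. It scores pairs of columns by how much their values overlap, then uses the best-matching pair as a join key.

```python
import pandas as pd

employees = pd.DataFrame({"emp_id": [1, 2, 3],
                          "name": ["Ada", "Grace", "Alan"],
                          "dept_code": ["CS", "EE", "CS"]})
departments = pd.DataFrame({"code": ["CS", "EE", "ME"],
                            "department": ["Computer Science", "Electrical Eng.", "Mechanical Eng."]})

def overlap(col_a, col_b):
    """Jaccard overlap between the value sets of two columns."""
    a, b = set(col_a.dropna()), set(col_b.dropna())
    return len(a & b) / len(a | b) if a | b else 0.0

# Score every column pair across the two tables and pick the best match.
scores = {(ca, cb): overlap(employees[ca], departments[cb])
          for ca in employees.columns for cb in departments.columns}
(best_left, best_right), best_score = max(scores.items(), key=lambda kv: kv[1])
print(f"linking on {best_left} <-> {best_right} (overlap {best_score:.2f})")

# Use the discovered link to answer a query that spans both tables.
joined = employees.merge(departments, left_on=best_left, right_on=best_right)
print(joined[["name", "department"]])
```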

“Modern organizations have many thousands of data sets spread across files, spreadsheets, databases, data lakes, and other software systems,” says Sam Madden, an MIT professor of electrical engineering and computer science and faculty director of MIT’s bigdata@CSAIL initiative. “Civilizer helps analysts in these organizations quickly find data sets that contain information that is relevant to them and, more importantly, combine related data sets together to create new, unified data sets that consolidate data of interest for some analysis.”

The researchers presented their system last week at the Conference on Innovative Data Systems Research. The lead authors on the paper are Dong Deng and Raul Castro Fernandez, both postdocs at MIT’s Computer Science and Artificial Intelligence Laboratory; Madden is one of the senior authors. They’re joined by six other researchers from Technical University of Berlin, Nanyang Technological University, the University of Waterloo, and the Qatar Computing Research Institute. Although he’s not a co-author, MIT adjunct professor of electrical engineering and computer science Michael Stonebraker, who in 2014 won the Turing Award — the highest honor in computer science — contributed to the work as well.

Pairs and permutations

Data Civilizer assumes that the data it’s consolidating is arranged in tables. As Madden explains, in the database community, there’s a sizable literature on automatically converting data to tabular form, so that wasn’t the focus of the new research. Similarly, while the prototype of the system can extract tabular data from several different types of files, getting it to work with every conceivable spreadsheet or database program was not the researchers’ immediate priority. “That part is engineering,” Madden says.

An unexpected career path in the medical field

During January of her junior year at MIT, Caroline Colbert chose to do a winter externship at Massachusetts General Hospital (MGH). Her job was to shadow the radiation oncology staff, including the doctors who care for patients and the medical physicists who design radiation treatment plans.

Colbert, now a senior in the Department of Nuclear Science and Engineering (NSE), had expected to pursue a career in nuclear power. But after working in a medical environment, she changed her plans.

She stayed at MGH to work on building a model to automate the generation of treatment plans for patients who will undergo a form of radiation therapy called volumetric-modulated arc therapy (VMAT). The work was so interesting that she is still involved with it and has now decided to pursue a doctoral degree in medical physics, a field that allows her to blend her training in nuclear science and engineering with her interest in medical technologies.

She has even narrowed her search to schools whose programs are accredited by the Commission on Accreditation of Medical Physics Graduate Programs, so she’ll have the option of making a more direct impact on patients. “I don’t know yet if I’ll be more interested in clinical work, research, or both,” she says. “But my hope is to work in a hospital setting.”

Many NSE students and faculty focus on nuclear energy technologies. But, says Colbert, “the department is really supportive of students who want to go into other industries.”

It was as a middle school student that Colbert first became interested in engineering. Later, in a chemistry class, a lesson about nuclear decay set her on a path towards nuclear science and engineering. “I thought it was so cool that one element can turn into another,” she says. “You think of elements as the fundamental building blocks of the physical world.”

Colbert’s parents, both from the Boston area, had encouraged her to apply to MIT. They also encouraged her towards the medical field. “They loved the idea of me being a doctor, and then when I decided on nuclear engineering, they wanted me to look into medical physics,” she says. “I was trying to make my own way. But when I did look seriously into medical physics, I had to admit that my parents were right.”

At MGH, Colbert’s work began with searching for practical ways to improve the generation of VMAT treatment plans. As with another form of radiation therapy called intensity-modulated radiation therapy (IMRT), the technology focuses radiation doses on the tumor and away from the healthy tissue surrounding it. The more accurate the dosing, the fewer side effects patients have after therapy.

Shedding light on the purpose of inhibitory neurons

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have developed a new computational model of a neural circuit in the brain, which could shed light on the biological role of inhibitory neurons — neurons that keep other neurons from firing.

The model describes a neural circuit consisting of an array of input neurons and an equivalent number of output neurons. The circuit performs what neuroscientists call a “winner-take-all” operation, in which signals from multiple input neurons induce a signal in just one output neuron.

Using the tools of theoretical computer science, the researchers prove that, within the context of their model, a certain configuration of inhibitory neurons provides the most efficient means of enacting a winner-take-all operation. Because the model makes empirical predictions about the behavior of inhibitory neurons in the brain, it offers a good example of the way in which computational analysis could aid neuroscience.
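A toy simulation of such a winner-take-all circuit shows the basic role of inhibition: a single inhibitory unit pools the outputs' activity and feeds suppression back to all of them, so only the output driven by the strongest input stays active. This is a simplified rate-based caricature written for illustration, not the model analyzed in the paper.

```python
import numpy as np

def winner_take_all(inputs, steps=500, dt=0.1, self_excite=0.5, inhibition=1.5):
    """Toy rate model: n inputs, n outputs, and one global inhibitory neuron."""
    x = np.asarray(inputs, dtype=float)
    y = np.zeros_like(x)                          # output activities
    for _ in range(steps):
        inhib = y.sum()                           # inhibitory neuron pools all output activity
        drive = x + self_excite * y - inhibition * inhib
        target = np.maximum(drive, 0.0)           # firing rates cannot go negative
        y += dt * (target - y)                    # leaky integration toward the drive
    return y

# The output corresponding to the largest input stays active; inhibition silences the rest.
print(np.round(winner_take_all([0.30, 0.90, 0.55]), 3))
```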

The researchers will present their results this week at the conference on Innovations in Theoretical Computer Science. Nancy Lynch, the NEC Professor of Software Science and Engineering at MIT, is the senior author on the paper. She’s joined by Merav Parter, a postdoc in her group, and Cameron Musco, an MIT graduate student in electrical engineering and computer science.

For years, Lynch’s group has studied communication and resource allocation in ad hoc networks — networks whose members are continually leaving and rejoining. But recently, the team has begun using the tools of network analysis to investigate biological phenomena.

“There’s a close correspondence between the behavior of networks of computers or other devices like mobile phones and that of biological systems,” Lynch says. “We’re trying to find problems that can benefit from this distributed-computing perspective, focusing on algorithms for which we can prove mathematical properties.”

Artificial neurology

In recent years, artificial neural networks — computer models roughly based on the structure of the brain — have been responsible for some of the most rapid improvement in artificial-intelligence systems, from speech transcription to face recognition software.

An artificial neural network consists of “nodes” that, like individual neurons, have limited information-processing power but are densely interconnected. Data are fed into the first layer of nodes. If the data received by a given node meet some threshold criterion — for instance, if they exceed a particular value — the node “fires,” or sends signals along all of its outgoing connections.
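In code, one such layer of threshold "nodes" can be written in a few lines; this is a bare-bones illustration of the firing rule described above, not any particular production network.

```python
import numpy as np

def threshold_layer(inputs, weights, thresholds):
    """Each node sums its weighted inputs and 'fires' (outputs 1) if the sum exceeds its threshold."""
    totals = inputs @ weights                 # weighted sum arriving at each node
    return (totals > thresholds).astype(float)

rng = np.random.default_rng(0)
x = rng.random(4)                             # data fed into the first layer
W = rng.standard_normal((4, 3))               # connections from 4 inputs to 3 nodes
fired = threshold_layer(x, W, thresholds=np.zeros(3))
print(fired)                                  # 1.0 where a node fired, 0.0 where it stayed silent
```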

How about computers that explain themselves?

Machines that predict the future, robots that patch wounds, and wireless emotion-detectors are just a few of the exciting projects that came out of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) this year. Here’s a sampling of 16 highlights from 2016 that span the many computer science disciplines that make up CSAIL.

Robots for exploring Mars — and your stomach

  • A team led by CSAIL director Daniela Rus developed an ingestible origami robot that unfolds in the stomach to patch wounds and remove swallowed batteries.
  • Researchers are working on NASA’s humanoid robot, “Valkyrie,” which will be programmed for trips into outer space and will autonomously perform tasks.
  • A 3-D printed robot made of both solids and liquids was printed in a single step, with no assembly required.

Keeping data safe and secure

  • CSAIL hosted a cyber summit that convened members of academia, industry, and government, including featured speakers Admiral Michael Rogers, director of the National Security Agency; and Andrew McCabe, deputy director of the Federal Bureau of Investigation.
  • Researchers came up with a system for staying anonymous online that uses less bandwidth to transfer large files between anonymous users.
  • A deep-learning system called AI2 was shown to be able to predict 85 percent of cyberattacks with the help of some human input.

Advancements in computer vision

  • A new imaging technique called Interactive Dynamic Video lets you reach in and “touch” objects in videos using a normal camera.
  • Researchers from CSAIL and Israel’s Weizmann Institute of Science produced a movie display called Cinema 3D that uses special lenses and mirrors to allow viewers to watch 3-D movies in a theater without having to wear those clunky 3-D glasses.
  • A new deep-learning algorithm can predict human interactions more accurately than ever before, by training itself on footage from TV shows like “Desperate Housewives” and “The Office.”
  • A group from MIT and Harvard University developed an algorithm that may help astronomers produce the first image of a black hole, stitching together telescope data to essentially turn the planet into one large telescope dish.

Tech to help with health

  • A team produced a robot that learns from humans to help schedule and assign tasks in fields like medicine and the military.
  • Researchers came up with an algorithm for identifying organs in fetal MRI scans to extensively evaluate prenatal health.
  • A wireless device called EQ-Radio can tell if you’re excited, happy, angry, or sad, by measuring breathing and heart rhythms.