Two hundred years ago the French mathematician Pierre-Simon Laplace, a man famous for his pioneering work in the field of statistics, commented:
“... [The] Mind that in a given instance knew all the forces by which nature is animated and the position of all the bodies of which it is composed, if it were vast enough to include all these data within his analysis, could embrace in one single formula the movements of the largest bodies of the universe and of the smallest atoms; nothing would be uncertain for him; the future and the past would be equally before his eyes.”
Many readers will recognize this once-metaphysical idea as the concept behind today’s recommender systems: those “you enjoyed X, why not try Y?” suggestions used by online retailers like Amazon. By analyzing our past behavior, along with that of users who have expressed similar preferences (the approach behind so-called “k-nearest neighbor” collaborative filtering), recommender systems provide a neat and sometimes alarmingly prescient way of predicting what we are likely to be interested in, even before we have come across a particular item.
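For readers who want to see the mechanics, here is a minimal sketch of user-based k-nearest neighbor collaborative filtering in Python. It is an illustration only, not the method used by Amazon or Mendeley: the `ratings` data, the user and paper names, the choice of cosine similarity, and the `recommend` helper are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); every name and number is invented.
ratings = {
    "ana":   {"paper_a": 5, "paper_b": 4, "paper_c": 1},
    "ben":   {"paper_a": 5, "paper_b": 5, "paper_d": 4},
    "chloe": {"paper_a": 1, "paper_c": 5, "paper_e": 4},
    "dev":   {"paper_a": 4, "paper_b": 4, "paper_d": 5, "paper_e": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts (0.0 if no shared items)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, k=2, top_n=3):
    """Suggest items the user has not rated, scored by the k most similar users."""
    target = ratings[user]
    # Rank every other user by similarity to the target and keep the top k.
    neighbors = sorted(
        (
            (cosine_similarity(target, other), name)
            for name, other in ratings.items()
            if name != user
        ),
        reverse=True,
    )[:k]

    weighted_sum = {}
    weight_total = {}
    for similarity, name in neighbors:
        if similarity <= 0:
            continue
        for item, rating in ratings[name].items():
            if item in target:
                continue  # skip items the user already knows
            weighted_sum[item] = weighted_sum.get(item, 0.0) + similarity * rating
            weight_total[item] = weight_total.get(item, 0.0) + similarity

    # Predicted rating = similarity-weighted average of the neighbors' ratings.
    predicted = {item: weighted_sum[item] / weight_total[item] for item in weighted_sum}
    return sorted(predicted, key=lambda item: (-predicted[item], item))[:top_n]


print(recommend("ana", ratings))
```

In this toy data, “ana” rates papers most like “ben” and “dev” do, so the papers those two rated highly and she has not yet seen come out on top. A production system would also have to deal with sparse data, rating bias, and scale, none of which this sketch attempts.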
It would be a mistake, of course, to imagine that such technology is relevant only in the world of Internet retail. Technological development has long been linked with universities and academia; the first music recommender system was developed at MIT in the early 1990s. This trend continues today. Earlier this month, U.K.-based company Mendeley celebrated the connection by staging a mini-conference in London, England, on the subject of Academic-Industrial Collaborations for Recommender Systems. Its seven presentations, delivered by eight different speakers, revealed a plethora of ways in which recommender systems are helping to revolutionize the world of academia.
The Search For Accurate Academic Recommendations
FastCo.Labs also had the opportunity to speak with Mendeley’s chief data scientist, Kris Jack, about the pioneering work his company is engaged in.
“Academics are busy people,” he says. “They require time to be able to carry out their research, connect with people, collaborate with colleagues, plan and stage experiments, and analyze the results. As they are doing this, our systems monitor the different signals that are sent out, in the least intrusive way possible, and build intelligent links between them so that we can make recommendations that are useful and make sense.”
It was the English mathematician and philosopher Alfred North Whitehead who observed that society “advances by adding to the number of important operations which we can perform without thinking about them.” In essence, this is the central concept of a recommender system. Unlike search engines such as Google, recommender systems do not require specific search terms to be entered; instead they glean from user data the information that is likely to be of interest and pull it to one side.
The difficulty is working out exactly what that information should be. While customers in online retail can sometimes be grateful for whatever suggestions are made, in the case of academics, companies like Mendeley are building a discovery tool aimed at people well-versed in research. It’s the computational version of trying to sell water to a well.
“If you have a very experienced researcher you do not want to keep recommending information to them that is very fundamental,” Jack says. “If our system sees that users show a deep understanding of the academic papers which have had particular impact in an area, we want to be able to understand in as accurate a sense as possible exactly what it is that they are trying to research so that we can best model their informational needs.”
There is also the added challenge of avoiding filter bubbles. “The idea of serendipity is very important,” Jack says. “One of our real strengths, I think, is the ability to introduce a bit of novelty so that the information we are presenting is not the same as what researchers would necessarily have been able to access themselves had they done the searching themselves in the area they are working in. We want to be able to create links that are not obvious--and ones that would not have been able to be found using a traditional search engine.”
Forget Indexing Books And Articles--Let’s Index Ideas
According to the Internet, the phrase “So many books, so little time” can be attributed to none other than the late American musician and polymath Frank Zappa. From the earliest days of organized knowledge, philosophers, scientists, and anyone else with a vested interest in information have fretted about man’s inability to read and absorb every bit of data that is produced. This situation is only exacerbated by the Internet, which not only creates more information by making everyone into a joint author/publisher, but also opens up our ability to access whatever information is out there.
In much the same way that iTunes helped teach us that the correct unitary measure of music is not the album but the single, companies such as Mendeley are demonstrating that the proper unitary measurement for the academic recommender systems of the future is not simply books or articles, but ideas.
“Every research article that gets written is packed full of different ideas,” Jack says. “When people are reading documents we have the ability to drill down to see whether they spend more time on particular pages, or perhaps highlight certain sentences that are of special interest. Our recommender systems then allow us to say that of the papers people are reading in a particular network you might be part of, such-and-such an idea is the one that gets the most attention paid to it.”
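As a rough illustration of what Jack describes, the sketch below adds up reading signals per idea rather than per document. It is not Mendeley’s implementation: the event format, the idea tags, the `rank_ideas` helper, and the decision to count one highlight as 60 seconds of reading are all hypothetical, and the sketch assumes that pages have already been mapped to ideas by some other means.

```python
from collections import defaultdict

# Hypothetical reading events from one network of researchers:
# (reader, paper, idea tag, seconds spent on the page, number of highlights)
events = [
    ("r1", "p1", "topic-models", 120, 2),
    ("r2", "p1", "topic-models", 90, 1),
    ("r2", "p2", "matrix-factorization", 200, 0),
    ("r3", "p2", "matrix-factorization", 30, 0),
    ("r3", "p3", "topic-models", 60, 1),
]


def rank_ideas(events, highlight_weight=60.0):
    """Rank ideas by total attention: reading time plus weighted highlights."""
    attention = defaultdict(float)
    for _reader, _paper, idea, seconds, highlights in events:
        # Assumption: one highlight counts as much as `highlight_weight` seconds.
        attention[idea] += seconds + highlight_weight * highlights
    # Most attention first; ties are broken alphabetically by idea tag.
    return sorted(attention.items(), key=lambda pair: (-pair[1], pair[0]))


for idea, score in rank_ideas(events):
    print(idea, score)
```

Because the totals are keyed by idea rather than by paper, an idea that appears in several papers accumulates attention from all of them, which is the shift in unit that the article describes.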
The ability to operate at this kind of granular level not only allows research to be carried out more quickly--saving the time it would take someone to scan through a publication for the one or two lines most relevant to their work--but also opens up exciting new possibilities. Imagine, for instance, that based on the ideas a recommender system knows are of interest to you, or else the research goals you have specified, it could then match you up with other researchers for possible fruitful collaboration. Companies could similarly use this technology to recommend particular employees (in the way that Amazon recommends books, or Netflix recommends movies) as the person to speak to about a specific problem, thus stripping away unnecessary levels of bureaucracy.
More enticing still is the potential for cross-pollination across academic subjects. “It may be that there is a question in computer science, described using a certain terminology, that no one has the solution for,” Jack points out. “That exact same problem might be being addressed using completely different words in the field of physics--where they do have the answer. The question that we are dealing with is how exactly do we create these links so that we can understand that these two problems, despite being described in different disciplines using different words, are actually the same problem?”
Back in the 1970s, Xerox PARC researcher Alan Kay dreamed of transforming the computer into the “meta-medium” that would encompass every other medium, from typing to video editing. Decades on, recommender systems might be doing much the same for the gap between disciplines.
Other Highlights From Academic-Industrial Collaborations for Recommender Systems
Former Microsoft Xbox recommender system researcher Jagadeesh Gorla began the day by delivering a presentation entitled “A Bi-directional Unified Model,” in which he described group recommendations using a new probabilistic model based on ideas from the field of information retrieval, one that learns probabilities expressing the match between arbitrary user and item features.
Double act Nikos Manouselis and Christoph Trattner then took to the stage to discuss the opportunities and challenges presented by academic-industrial collaborations, giving honest and candid reflections from both sides of the fence.
Heimo Gursch delivered his “Thoughts on Access Control in Enterprise Recommender Systems,” describing a system that enables employees within a company to effectively share their access control rights with one another, rather than relying on top-down authority to provide them.
Maciej Dabrowski discussed his recent work in a presentation called “Towards Near Real-Time Social Recommendations in an Enterprise,” which explained recommender systems that exploit semantic data from linked data repositories to generate recommendations across multiple domains.
Benjamin Habegger gave a roller-coaster ride of a talk in which he described the ups and downs of his most recent startup, reflecting on the mistakes made along the way and questioning his decision to work with academics during the process.
Finally, Thomas Stone presented “Venture Rounds, NDAs, and Toolkits,” in which he described applying recommender systems to the field of venture finance, discussed his nightmare experiences with NDAs during his PhD, and offered an introduction to PredictionIO, an open-source machine learning server.