Insights and observations from RecSys 2015

by Balázs Hidasi

 

Overall I enjoyed this year’s RecSys conference. It was well organized, and it was nice to see what other researchers and people from the industry were up to. However, I was somewhat disappointed to see that research in this field has slowed down significantly, both in quality and in quantity. In my opinion there were only a handful of long papers where the core idea was novel and exciting. (See my top picks below.) I don’t know the exact reason behind this shift and I doubt anyone does, but I find it interesting to speculate about it. Looking at the big picture, this slowdown might not be that surprising. Here are my thoughts on this year’s conference.

Evaluation and goals of a recommender system in research and in practice

In the last five or so years, recommender systems research has moved closer and closer to practical systems. Fortunately, the days of rating prediction are pretty much over and the majority of work focuses on the realistic scenario of top-N recommendation. You can also see other signs of this shift, e.g. more papers working with implicit feedback and/or using online evaluation. This is generally a good thing, because it speeds up the transition of novel methods from research to industry.

However, there is a huge problem with recommender systems in practice: evaluation. In his keynote speech, Igor Perisic talked about “making delight” being the goal of these systems and products. I fully agree with the notion that the final goal of a recommender system is to make its users’ lives easier, help them with their problems (related to finding what they need), and generally make using the system a good experience for them. But from a research point of view you can’t evaluate methods with respect to “delight”. You can try to approximate it through several steps by using different online metrics. However, metrics that are good for A/B testing, such as CTR, don’t approximate the final goal well. Offline tests in turn only approximate online performance, so they add yet another approximation step. Still, their value can’t be dismissed: they are useful for prefiltering methods, and from the research point of view they are exact and repeatable. It is clear which algorithm performs better by a concrete metric, and running the same test three months later will give exactly the same results.

Long story short, as recommender systems research transitions towards industry, researchers find that they can’t evaluate their methods in a way that is very meaningful in practice. Therefore the majority of practitioners take the reported performance of novel algorithms with a pinch (or lots) of salt and still use very basic methods. This is disheartening, and it slows down the progress of research.
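
To make the “exact and repeatable” point about offline tests concrete, here is a minimal sketch of two common top-N metrics, recall@N and a binary-relevance NDCG@N. The function names and the toy data are made up for illustration and are not taken from any paper or library; the point is only that the same ranking and the same held-out items always produce the same numbers.

```python
import math


def recall_at_n(ranked_items, relevant_items, n):
    """Fraction of the held-out (relevant) items that appear in the top n."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:n] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_n(ranked_items, relevant_items, n):
    """Binary-relevance NDCG: discounted gain of the hits over the ideal gain."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(ranked_items[:n])
        if item in relevant_items
    )
    ideal_hits = min(len(relevant_items), n)
    ideal_dcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical toy data: one user's ranked recommendations and held-out items.
ranking = ["a", "b", "c", "d", "e"]
held_out = {"b", "e", "z"}

print(recall_at_n(ranking, held_out, 3))
print(ndcg_at_n(ranking, held_out, 3))
```

Nothing in this computation depends on live traffic, which is exactly why it is repeatable, and also why it is one more step removed from what users actually experience.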

 

Exhausted research topics

Currently popular topics are generally well researched, and the same topics have been popular for the last decade. Factorization methods, context-awareness, cold-start avoidance, hybrid algorithms, etc. have all been around for a while now. Even though the appearance of implicit feedback in research has spiced things up, that itself has become somewhat exhausted. This has naturally caused a slowdown, because additional research can only add a small epsilon to already existing solutions. I think that the community is waiting for the next big thing, something that is fundamentally different and shakes things up. This new area, however, must deal with a problem that is important in practice, and it must be algorithmically challenging and interesting to researchers. I think that some researchers already have a candidate that could qualify. You could hear whispers among the crowd here and there, and several researchers I talked with mentioned a certain topic that they will start working on shortly. 🙂 If we think about it optimistically, perhaps this year was just the calm before the storm and the next few years might be the most exciting period of recommender systems research yet.

 

The lack of an industry track

There may also have been a conference-specific reason for the low output of exciting research papers at this year’s RecSys conference. RecSys is traditionally a conference for both academia and industry, for both research and application. However, papers can only be submitted to a research track. Purely application-related presentations are generally in the (invitation-based) industry sessions. To my surprise, there were several papers in this year’s research track that I would describe as high quality engineering work. This type of work combines ideas from previous years’ research as components of a system that provides recommendations in a specific scenario of a specific domain. The technical quality of these papers is generally high, but the novelty for research is negligible. I think these papers have a place at a conference like RecSys, but I don’t agree with including them in the research track. Did these papers take places from actual research papers? I’m not sure. Maybe they did, or maybe it is the other way around: maybe there weren’t enough high quality research papers, so the remaining slots were filled with high quality engineering work. Whatever the case may be, I think that the conference would benefit from having a separate industry track for papers of the engineering kind.

 

I don’t think that there is a single reason behind the slowdown of research. I think all three of the aforementioned theories are correct to some extent. They – as well as other factors I haven’t considered – cumulatively caused this phenomenon.

Best paper picks of RecSys 2015

Despite my complaints, there were several papers at RecSys 2015 that I enjoyed. The following list contains my top picks from the main conference (long and short papers), with some justification for why I liked them. There are no workshop papers on the list, because I haven’t fully processed those. I restricted myself to selecting only three papers, the ones that I found the most exciting. There were other papers and ideas I liked, but these were the most interesting ones for me. The items on the list are presented in no particular order.

 

Gaussian Ranking by Matrix Factorization (by Harald Steck)

Harald Steck, Netflix Inc.

The paper proposes a framework for directly optimizing ranking metrics, such as AUC and NDCG. While methods optimizing for certain ranking metrics were proposed before, this framework is much more general and promises to handle every metric as long as it is differentiable with respect to the ranks of the items. This includes most of the popular metrics, such as NDCG, AUC, MRR, etc. (but unfortunately not recall@N). The key to the framework is the link between the scores and the ranks, which makes the ranking loss differentiable with respect to the model parameters. The whole idea is very elegant and has additional potential beyond the scope of the paper. (Also, bonus points for using the NSVD1 model, even if it is referred to as AMF. NSVD1 is a classic method that is unjustly forgotten, yet I review at least one paper a year that tries to reinvent it.)
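
To give a feeling for what a score-to-rank link can look like, here is a rough sketch under an assumption of mine: if the scores are roughly normally distributed, the rank of an item can be approximated by a smooth function of its standardized score through the normal CDF. This is only an illustration of the general idea; the paper’s exact formulation may differ, and the function names and example scores are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approx_ranks(scores):
    """Smooth rank estimate, assuming the scores are roughly Gaussian.

    rank_i ~ 1 + (n - 1) * (1 - Phi((s_i - mean) / std)), so a high score
    maps to a rank close to 1 and the mapping is differentiable in s_i.
    """
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / n
    std = math.sqrt(variance) or 1.0  # fall back to 1.0 if all scores are equal
    return [1.0 + (n - 1) * (1.0 - normal_cdf((s - mean) / std)) for s in scores]


def exact_ranks(scores):
    """Ordinary (non-differentiable) ranks; 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Hypothetical scores of five items for one user.
scores = [2.0, 0.5, -1.0, 0.0, 1.2]
print(exact_ranks(scores))
print([round(r, 2) for r in approx_ranks(scores)])
```

The approximate ranks preserve the ordering of the exact ones, but unlike sorting they change smoothly with the scores, so a metric written in terms of ranks can be pushed through to the model parameters by the chain rule.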

 

Dynamic Poisson Factorization (by Laurent Charlin et al.)

I would probably have missed the original paper on Poisson factorization had it not been for this presentation at RecSys 2015. That would have been a shame, because the base algorithm is very interesting. It seems to be a better fit for implicit feedback than Gaussian factorization methods (or their frequentist counterparts that optimize for the sum of squared errors). It also has a few additional nice properties. The algorithm presented at RecSys builds on this novel factorization and adds evolving user and item feature vectors to the mix. This is an answer to a practical problem: the taste of users and the audience of items change, and we should model that somehow. The only thing I miss from the paper is a comparison with a factorization method that uses some kind of event decay function. (Bonus points for the clear and comprehensible style of the paper. Given the complexity of the algorithm, it was by no means easy to convey in such a simple way.)
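
One of those nice properties can be seen directly from the Poisson log-likelihood. The sketch below is my own illustration, not code from either paper: it models an event count as Poisson with a rate equal to the dot product of non-negative user and item factors, and the factor values are hypothetical.

```python
import math


def predicted_rate(user_factors, item_factors):
    """Non-negative factors give a non-negative Poisson rate via a dot product."""
    return sum(u * v for u, v in zip(user_factors, item_factors))


def poisson_log_likelihood(count, rate):
    """log P(count | rate) for a Poisson distribution with rate > 0."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def squared_error(count, prediction):
    """Per-entry loss of a Gaussian (sum of squared errors) model, for contrast."""
    return (count - prediction) ** 2


# Hypothetical non-negative factors for one user and one item.
user = [0.5, 0.1, 0.0]
item = [0.4, 1.0, 2.0]
rate = predicted_rate(user, item)

# An unobserved (zero) entry contributes only -rate to the log-likelihood.
print(poisson_log_likelihood(0, rate), -rate)
# An observed entry with two events.
print(poisson_log_likelihood(2, rate), squared_error(2, rate))
```

Because a zero entry contributes just the negative rate, the total contribution of all the unobserved user-item pairs is a sum of dot products, which can be computed from the column sums of the factor matrices instead of visiting every missing entry. That is convenient for sparse implicit feedback, where almost all entries are zero.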

 

Predicting Online Performance of News Recommender Systems Through Richer Evaluation Metrics (by Andrii Maksai et al.)

How offline metrics relate to online KPIs has been an important and yet unanswered question. While several papers suggested that optimizing for (ranking based) accuracy might not be the best course of action, there was no clear alternative. What is a good trade-off between accuracy and diversity? Will a 5% increase in recall@N translate to a noticeable increase in CTR? Nobody had an exact answer to these questions, although if you’ve worked enough with recommenders you know a few rules of thumb. This paper might solve this issue, as it proposes building a predictive model from 17 offline metrics to estimate CTR. Using this model the authors present interesting findings, e.g. that in news recommendation diversity and serendipity seem to be a little bit more important than accuracy. I think that the validity of the proposed approach depends entirely on how accurate the CTR prediction is over time. At first glance the results are convincing, but it is alarming that the constant CTR prediction also has a low error. Nonetheless, I think this is an interesting direction that is worth exploring in additional domains. (Bonus points for pointing out that different metrics of the same type (e.g. accuracy metrics) are highly correlated with each other. Maybe reviewers in the future won’t mind if you don’t use their favorite accuracy metric.)
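
My concern about the constant predictor can be turned into a simple sanity check: compare the error of the CTR model with the error of always predicting the mean CTR. The sketch below is my own and does not reproduce the paper’s model or data; the function names and the CTR values are hypothetical, and the constant is computed on the same data for brevity.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def skill_vs_constant(actual, predicted):
    """1 - RMSE(model) / RMSE(constant mean predictor).

    1.0 means a perfect model, 0.0 means no better than predicting the mean,
    and negative values mean worse than the constant.
    """
    mean = sum(actual) / len(actual)
    baseline = rmse(actual, [mean] * len(actual))
    if baseline == 0.0:
        return 0.0  # the target never varies, so there is nothing to explain
    return 1.0 - rmse(actual, predicted) / baseline


# Hypothetical daily CTR values and two hypothetical sets of predictions.
observed_ctr = [0.010, 0.012, 0.011, 0.013]
model_a = [0.0105, 0.0118, 0.0111, 0.0128]
model_b = [0.0115, 0.0115, 0.0115, 0.0115]

print(skill_vs_constant(observed_ctr, model_a))
print(skill_vs_constant(observed_ctr, model_b))
```

If CTR barely moves over the evaluation period, even a model with a tiny absolute error can score close to zero here, which is why a low error on its own is not yet convincing.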

 

 

In closing

This year’s RecSys conference left me with feelings of ambivalence. Although there were a handful of papers containing substantial research and contributions to the field of RecSys, the overall program felt lacking. The conference was well organized, but not as strong as it has been in the last few years. In the future, I hope to see the rise of one or more completely novel topics to shake up the field and make things within the RecSys research community exciting again. Lastly, the presentation of research and engineering papers side by side this year felt unnatural, and I believe that dividing these topics into separate tracks will benefit the conference in a big way.

 

About

Balázs Hidasi is the Head of Data Mining and Research at Gravity R&D. He is responsible for coordinating his team’s research and conducting his own research on advanced recommender algorithms. His areas of expertise include deep learning, context-aware recommender systems, and tensor and matrix factorization. Balázs also coordinates and consults for data mining projects within the company. He has a PhD in computer science from the Budapest University of Technology.

Gravity selected as one of the fastest growing CE startups!

Deloitte has selected Gravity R&D to be a recipient of The Deloitte Technology Fast 50 in Central Europe award. Gravity was awarded as the 2nd fastest growing startup in Hungary, and the 25th fastest growing startup overall in the CE region. Napi.hu also covered this story.

This is the 16th year of the competition, which ranks the 50 fastest growing public or private sector technology companies. Deloitte bases its decision on financial statements and annual revenues, and the ranking is determined by the percentage of revenue growth over a three-year period. Gravity grew by 466% during the selected period.

“The CE Technology Fast 50 report is all about celebrating companies large and small, public and private, local and global that are delivering technological innovations which drive industry and business in our region forward,” said Alastair Teare, CEO, Deloitte Central Europe.

Receiving this award is a testament to just how quickly our business has expanded in the past several years. It draws attention to our innovative approach for business and potential for growth. And it comes at a good time. Our recent expansion into the hyper growth market of Southeast Asia that includes two new regional offices, several new hires, and a quickly growing portfolio of local companies, is sure to lead Gravity into several more years of outstanding growth.

Our clients are growing quickly as well. We’ve been working together with eMAG, the largest online store in Central and Eastern Europe, since 2014. When we started working together, their yearly revenue was €260 million. eMAG is now on track to reach revenues of €500 million this year, and aims to reach €1 billion by 2017. Our recommendations are improving the user experience across their site, helping people find relevant products quickly. Our cooperation with eMAG has now expanded to cover all of their country-specific sites, including Hungary, Poland, Romania, and Bulgaria.

It’s exciting to be recognized for all the hard work that’s been going on behind the scenes here at Gravity over the past several years. Receipt of this award places us into the spotlight and will fuel our team’s efforts moving forward to continue innovating the world’s best recommendation engine.

Gravity at the ICMA conference in Berlin

The International Classified Media Association (ICMA) is a trade organization representing the top players in the classified industry. Last week, Gravity took part in a global conference organized by ICMA in Berlin (May 6-8). The main focus of the conference was the sharing economy, a new business concept brought about mainly by the emergence of social media. As Gravity is a leading personalization provider for the classified industry, we talked about the importance of personalization in today’s digital world and highlighted the impact that mobile has on the way people use the internet, including classifieds. If you’re interested in our presentation, check it out here: http://prezi.com/4arc7lxvuaoo

Gravity R&D at the Budapest Startup Day

We’re delighted to announce that Gravity R&D has been chosen to take part in the Morgan Stanley Budapest Startup Day that will take place December 9th! Each year, 9 of the best Hungarian ICT startups are selected to present their solutions and expertise, which is an amazing chance to connect with Morgan Stanley’s executives, investors and other tech experts.

We’ll keep you updated about the event. Stay tuned!
