
Tiki Boosts Revenues and Increases Customer Engagement by Partnering With Gravity

We have recently partnered with Tiki.vn, one of the most prominent players in Vietnam’s booming eCommerce sector. During the initial testing period, our solution significantly outperformed competitors.

Today’s well-informed consumers tend to favor sites with the lowest prices and are rarely loyal to any particular eCommerce brand. Vietnam has a vibrant marketing scene, but most companies are focusing their marketing efforts on new customer acquisition. When Tiki decided to look for a recommendation solution provider, their goal was to prioritize user engagement and generate more revenues by providing a personalized experience. Gravity’s solution has proven to be highly effective in facilitating the upselling and cross-selling of Tiki products and improving customer retention.

During the pilot period, Gravity’s recommendations resulted in an additional $13.15 GMV per 1000 recommendations and a 6% average conversion rate.

“Our cooperation with Tiki is a major milestone in the company’s expansion in the SEA region. We’ve recently opened our office in Vietnam and hired Ngô Kỳ Lam, an experienced e-commerce specialist, as our Country Manager in order to establish fruitful partnerships and provide maximum support to our present and future clients in the region.” – Marton Vertes – Business Development Manager, Gravity R&D

“Gravity has been outperforming other similar solutions in every important aspect and KPI: Revenue, AOV, CTR and response time. This is why, after evaluating the results, we have decided to move forward with Gravity as our chosen recommender system provider.” – Hung Tran Viet – Product Manager, Tiki

Tiki.vn is one of the largest eCommerce sites in Vietnam, a country showing enormous growth in online retail and the IT sector in general. Like Amazon, Tiki started out by selling books online. Over the years, they’ve significantly expanded the scope of their operations. Currently, the site hosts over 300,000 listings across more than twelve product categories, and traffic is growing substantially month over month.

Gravity offers scalable solutions for Enterprise clients with unique needs and larger traffic, as well as for small and medium-sized businesses through its turnkey SaaS recommendation engine, Yusp (www.yusp.com).


The logo of Yusp, the SME Recommendation as a Service product of Gravity R&D.

The Yusp Open Beta kicks off!

We recently launched our brand new SME solution, Yusp: an easy-to-administer, SaaS-based, out-of-the-box recommendation engine.

By installing Yusp, you harness the analytic power of Gravity’s proprietary, patented recommender algorithms, which rival in sophistication those used by Amazon and other top industry players.

As a part of our Open Beta program, we provide free installation and configuration support to all new registrants, in addition to the 30-day free trial!

What does Yusp do?

Recommendations served by Yusp run through the same servers and are generated by the exact same algorithms as our enterprise clients’. Our algorithms have proven capabilities to drive sales, increase conversions, and significantly boost customer satisfaction and engagement on your site.

Even better, getting Yusp to work requires no technical tour de force. Simply paste a code snippet into the header section of your site, alongside your Google Analytics tracking code, and everything else is done through our user-friendly graphical interface.

Import your stock automatically

Simply paste the URL of one of your product pages into the appropriate field, and Yusp automatically scans and imports the items in your stock.

A screenshot from the Dashboard interface of the Yusp recommendation engine.
Importing your product catalog into Yusp is fast and intuitive.

Tracking users

Our system tracks and records every detail of how users interact with your site and enriches this data with contextual parameters such as location, time, device, and referrer.

Personalized product recommendations

Building on these insights, our system fills the multi-device-friendly recommender boxes on your site with the items most relevant to each user in each context.

Drive sales and improve user experience

By showing the right product at the right time to the right users, Yusp creates personalized user journeys for each and every one of your visitors, significantly improving your sales metrics and conversion rates while maximizing customer engagement. Moreover, you can track the performance of the recommendations through our analytics dashboard.

A screenshot from the Dashboard interface of the Yusp recommendation engine.
You can easily track the performance of your recommendations through our analytics Dashboard.


Still skeptical? Don’t take our word for it: sign up for the Yusp Open Beta program and see for yourself. On top of the 30-day free trial, you get free, dedicated installation and configuration support so you can maximize the benefits Yusp brings to your business!



The logo of eMAG.

Gravity R&D: first year of collaboration with eMAG, a major success

Gravity R&D, one of the leading international technology experts in omnichannel recommendations, has completed the first year of collaboration with eMAG with positive results, including a 30% increase in conversion rates and additional revenues of 9 EUR per 1000 recommendations.

“We have great confidence in the effectiveness of our technology and our commitment to delivering results according to expectations. When we decided to enter the Romanian market, the most logical step was to test our solution with the biggest local eCommerce player. Our cooperation with eMAG confirms our ambitions and we are planning to expand further on the local market” says Serban Manea, Business Development Manager, Gravity R&D Romania.

Gravity R&D currently functions as an extension of eMAG’s in-house web development team. It is also the only provider active in Romania that offers a complete range of recommendation services to its partners: on-site recommendation boxes with relevant products for each user, personalized retargeting (converting and retaining users through ad displays), newsletter personalization, and smart search features.

eMAG uses Gravity R&D’s technology for personalized recommendations for both desktop and mobile users. After the testing period, the integration was extended to all other countries where eMAG is currently active: Bulgaria, Hungary, and Poland.

Gravity R&D has extensive international experience, working with companies such as Dailymotion.com, Allegro Group (allegro.pl), Schibsted Group (mudah.my – Malaysia, jofogas.hu – Hungary, avito.ma – Morocco), OLX Brazil and Jora.com.


The logo of Gravity R&D.

Gravity R&D is one of the world leaders in machine learning and recommendation engines. In 2009, they tied for first place in the Netflix Prize competition out of 18,000 competing teams from over 150 countries. This was the largest and most comprehensive machine learning competition ever held. Their team represents a collective of some of the best data mining and machine-learning scientists in the world.

The company offers scalable solutions both for big e-commerce sites with complex integration needs, as well as for medium and small businesses through its automated RECOplatform solution.

Gravity R&D delivers 5 billion monthly recommendations, generating additional revenues of over EUR 40 million for its partners. Recently, Gravity R&D has been part of 5 major A/B testing projects, outperforming their competitors on tenders with companies such as Dailymotion, Allegro (Naspers Group) or OLX Brazil.

The logo of eMAG.

Founded in 2001 by Romanian entrepreneurs, eMAG is a pioneer of Romania’s e-commerce landscape and has become a regional leader, with a presence in Bulgaria, Hungary, and Poland. eMAG invests constantly in services that help people save time and money, relying mostly on Romanian tech talent.

With an ever-increasing selection, both through its own product portfolio and through Marketplace partners, eMAG is the place where anyone can search for and order anything from anywhere. Customers are welcomed with value-added services such as a 30-day return period, package opening at delivery, a 24/7 call center, Service Pick Up and Return, installment financing through eCredit, a mobile app, and the “One-click pay” instant ordering service, activated with a swipe on the phone.

Currently, eMAG features over 500,000 listings, ranging from consumer electronics, mobile phones, and books to children’s toys, home appliances, garden tools, automotive parts, and sports equipment.


István Pilászy at Gravity R&D

Reflections on RecSys 2015

As is the norm at each yearly RecSys conference, there were several strong papers. But I found that while many of the papers were impressive from an engineering and practical-use standpoint, they did not fare as well from a scientific research standpoint. More specifically, I’m singling out articles that simply take an area of research, add in some additional data sources, and then integrate this additional data into existing formulas to eke out some improvement in accuracy.


We’ve come quite far since the days of RMSE

When the Netflix Prize competition was active, from 2006 to 2009, there was just one massive dataset (of 100 million ratings) and one target: the root-mean-square error (RMSE). During that time, the research was focused, and papers were very comparable to each other. We’ve come a long way since the papers published around the years of the Netflix Prize: it has since been established that an algorithm’s effectiveness varies depending on the dataset it is run against. It also turns out that RMSE isn’t a good choice when your purpose is to generate relevant recommendations.
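
For reference, the RMSE that Netflix Prize entrants minimized is defined over a held-out set of ratings $T$ as

$$\mathrm{RMSE} = \sqrt{\frac{1}{|T|}\sum_{(u,i)\in T}\bigl(r_{ui} - \hat{r}_{ui}\bigr)^2},$$

where $r_{ui}$ is user $u$’s actual rating of item $i$ and $\hat{r}_{ui}$ is the predicted one. A metric like this rewards accurate rating prediction uniformly across all held-out items, which says little about whether the few items placed at the top of a recommendation list are the relevant ones.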

At Gravity, we found back in 2009 that optimizing for RMSE is not effective, while building a public demo based on the Netflix Prize dataset. Further explanation of how we reached this conclusion can be found in the first section of our RecSys presentation: Neighbor methods vs. matrix factorization – case studies of real-life recommendations.

These days, there are plenty of datasets, and many different evaluation metrics are available. Adding to the complexity of the current state of the RecSys community, researchers often bring in additional data sources to create even more complex algorithms. Over time, research topics have become more diverse, and research papers are no longer comparable.

For Gravity’s customers, item-to-item recommendations (“people who viewed this item also viewed”) are in higher demand than personalized recommendations. However, it’s really hard to find papers on the topic of item-to-item recommendation.

That said, there was one paper from this year’s conference that I did find interesting, which I explain below.


Top-N Recommendation for Shared Accounts

The paper that stood out to me this year, and the one I would identify as my favorite, is Top-N Recommendation for Shared Accounts by Koen Verstrepen and Bart Goethals. I understood their approach to be the following:

  • Consider a user who has viewed N items.
  • Use a typical item-neighbor method to assign a score to each recommendable item based on what the user has viewed previously.
  • Create 2^N − 1 temporary users, each with a different non-empty subset of the original user’s viewing history.
  • Generate prediction scores for each of those temporary users, using the item-neighbor method.
  • For each temporary user, divide the scores by the temporary user’s history length, or by a power of that number (e.g. the square root of the history length).
  • When calculating the prediction for item i for the original user, take the maximum score for item i over all temporary users; that will be the final score of item i for the original user.
  • Order items by the computed prediction scores.

They show that this can be done in O(N log N) time instead of O(2^N). This approach (taking the maximum score over the temporary users) has another nice property: it can provide explanations, i.e. the root cause of why item i was recommended to the original user. Consider, for example, that for item i the maximum score was generated by a temporary user who viewed items i1, i2, and i3. Then the recommender algorithm can say that item i was recommended because the original user viewed items i1, i2, and i3.
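
To make this concrete, here is a minimal, naive O(2^N) Python sketch of the subset-based scoring described above (the paper’s actual algorithm achieves O(N log N) with a cleverer computation; the similarity dictionary, item IDs, and normalization exponent below are illustrative assumptions, not the authors’ code):

```python
import itertools
from typing import Dict, List, Tuple

def shared_account_scores(
    history: List[str],
    item_sim: Dict[Tuple[str, str], float],
    candidates: List[str],
    p: float = 0.5,  # normalization exponent; 0.5 divides by sqrt(|subset|)
) -> Dict[str, Tuple[float, Tuple[str, ...]]]:
    """Naive subset-based scoring for a (possibly shared) account.

    Every non-empty subset of the history acts as a temporary user: an
    item's score for that subset is its summed item-to-item similarity
    to the subset, normalized by |subset|**p. The final score of an
    item is the maximum over all subsets, and the argmax subset doubles
    as the explanation for why the item was recommended.
    """
    best: Dict[str, Tuple[float, Tuple[str, ...]]] = {}
    for r in range(1, len(history) + 1):
        for subset in itertools.combinations(history, r):
            norm = len(subset) ** p
            for item in candidates:
                if item in history:
                    continue  # never recommend already-viewed items
                score = sum(item_sim.get((seen, item), 0.0) for seen in subset) / norm
                if item not in best or score > best[item][0]:
                    best[item] = (score, subset)
    return best

# Toy usage: the account mixes two tastes; i4 ends up explained by i1 alone.
sims = {("i1", "i4"): 0.9, ("i2", "i4"): 0.2, ("i3", "i5"): 0.8}
print(shared_account_scores(["i1", "i2", "i3"], sims, ["i4", "i5"]))
# -> {'i4': (0.9, ('i1',)), 'i5': (0.8, ('i3',))}
```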

This paper was really interesting because it focused on algorithmic methods, featured a simple yet fast solution, and showed how the method helps when multiple users are using the same account (e.g. a household watching TV), without knowing the number of persons in the household or which person viewed which item. They also propose an elegant way to generate diverse recommendations (a small sketch follows the list below):


  • First, take the highest-scored item. It will also have some explanatory items (see above).
  • Second, take the highest-scored item from the rest, but consider only those items that have at least one explanatory item that is not among the first item’s explanatory items.
  • Third, take the highest-scored item from the rest, but consider only those items that have at least one explanatory item that is not among the explanatory items of the items selected so far.
  • And so on.
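
Building on the earlier sketch, this diversification idea could look like the following minimal greedy re-ranking (again an illustration of the idea under my reading of the paper, not the authors’ exact procedure):

```python
def diversify(scored, k):
    """Greedily pick up to k items; each new pick must contribute at
    least one explanatory item not already covered by earlier picks.

    `scored` maps item -> (score, explanation), in the shape returned
    by the shared_account_scores sketch above.
    """
    picked, covered = [], set()
    ranked = sorted(scored.items(), key=lambda kv: kv[1][0], reverse=True)
    for item, (_score, explanation) in ranked:
        if len(picked) == k:
            break
        if not picked or set(explanation) - covered:
            picked.append(item)
            covered.update(explanation)
    return picked

scored = {"i4": (0.9, ("i1",)), "i6": (0.85, ("i1",)), "i5": (0.8, ("i3",))}
print(diversify(scored, k=2))
# -> ['i4', 'i5']: i6 is skipped because its only explanatory item, i1,
#    already explains i4
```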


They also show that their method’s accuracy is comparable to that of the original neighbor method it builds on, and that it is capable of giving good recommendations when multiple people share the same account. In my opinion, this method is a nice way to give users recommendations that are diverse, accurate, and easily explainable, all at the same time. I really enjoyed this paper, as it provides new enhancements to a family of methods (item-based neighbor methods) that has been studied for so many years.


Closing thoughts

This year’s RecSys was a well-organized conference, and there were some really strong papers, as usual. But overall, I felt it lacked the spirit of the old years, when every conference brought the announcement of several new research breakthroughs. There used to be plenty of algorithmic papers every year, and everybody was always curious how the research would develop in the future. Now that we already have all those breakthroughs, the area is maturing, and it has become difficult to make big discoveries. The many engineering-focused papers this year also indicate the less research-oriented direction the conference is now taking.


In the future, I’d like to see more emphasis and research placed on the following topics:


  • Correlating offline and online measures (e.g. recall vs. CTR). There was a paper this year on this topic; hopefully there will be many more in the upcoming years.
  • Correlating short-term and long-term online measures (e.g. predicting long-term site-income increase from a short-term CTR increase). Simple example: if you make a customer buy twice as much water as usual, then this customer may skip buying water next time.
  • Item-to-item recommendations: this is a frequent topic in need of more research.
  • Matrix factorization methods that deal with really large and sparse matrices (e.g. 50M items × 100M users, with 3 events per user). The problem here is that you have to increase the number of latent factors, otherwise totally unrelated items might become similar.
  • Content-based filtering methods that are able to find the most relevant items in real time, even when there are 200M items. Currently, there are approximate solutions (e.g. Locality-Sensitive Hashing; see the sketch after this list) which provide a trade-off between accuracy and running time, but if you need good accuracy, you are better off running the naive approach.
  • AutoML is an interesting new direction: instead of manually choosing the best algorithms and manually tuning the hyperparameters, the aim is to have this process done automatically. Perhaps the RecSys community should take a step in this direction, e.g. a RecSys challenge posing 20 different RecSys problems simultaneously would be something new and challenging.
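
To illustrate the kind of approximate lookup mentioned above, here is a minimal random-hyperplane (cosine) LSH sketch; the dimensions, plane count, and function names are illustrative assumptions:

```python
import numpy as np
from collections import defaultdict

def build_lsh_index(item_vectors: np.ndarray, n_planes: int = 16, seed: int = 0):
    """Hash every item to the sign pattern of its projections onto random
    hyperplanes; vectors with small cosine distance tend to share the
    same bit signature, and hence the same bucket."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((item_vectors.shape[1], n_planes))
    signatures = (item_vectors @ planes) > 0
    index = defaultdict(list)
    for item_id, sig in enumerate(signatures):
        index[sig.tobytes()].append(item_id)
    return planes, index

def candidate_items(query_vec: np.ndarray, planes: np.ndarray, index) -> list:
    """Look up the query's bucket: a small candidate set to re-rank
    exactly, instead of scanning all (possibly 200M) items."""
    sig = (query_vec @ planes) > 0
    return index.get(sig.tobytes(), [])

items = np.random.default_rng(1).standard_normal((10_000, 64))
planes, index = build_lsh_index(items)
print(candidate_items(items[42], planes, index))  # bucket containing item 42
```

More planes shrink the buckets (faster exact re-ranking, but more missed neighbors); fewer planes do the opposite. That is precisely the accuracy/running-time trade-off noted in the list above.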


Here’s to wishing for more breakthroughs and excitement in the future of the RecSys community!

István Pilászy is the Head of Core Development as well as one of the founders at Gravity R&D.


Insights and observations from RecSys 2015

by Balázs Hidasi


Overall, I enjoyed this year’s RecSys conference. It was well organized, and it was nice to see what other researchers and people from the industry were up to. However, I was somewhat disappointed to see that the quality and quantity of research in this field have slowed down significantly. In my opinion, there were only a handful of long papers where the core idea was novel and exciting. (See my top picks below.) I don’t know the exact reason behind this shift, and I doubt anyone does, but I find it interesting to speculate on. Looking at the big picture, this slowdown might not be that surprising. Here are my thoughts on this year’s conference.


Evaluation and goals of a recommender system in research and in practice

In the last five or so years, recommender systems research has moved closer and closer to practical systems. Fortunately, the days of rating prediction are pretty much over, and the majority of work focuses on the realistic scenario of top-N recommendations. You can see other signs of this shift as well, e.g. more papers working with implicit feedback and/or using online evaluations. This is generally a good thing, because it makes the transition of novel methods from research to industry faster.

However, there is a huge problem with recommender systems in practice: evaluation. In his keynote speech, Igor Perisic talked about “making delight” being the goal of these systems and products. I fully agree with the notion that the final goal of a recommender system is to make its users’ lives easier, help them with their problems (related to finding what they need), and generally make using the system a good experience for them. But from a research point of view, you can’t evaluate methods with respect to “delight”. You can try to approximate it through several steps by using different online metrics, but metrics that are good for A/B testing, such as CTR, do not approximate the final goal well. And offline tests are approximations of the online performance, so they add yet another approximation step. Still, their value can’t be discarded, as they are useful for prefiltering methods. And from the research point of view, offline tests are exact and repeatable: it is clear which algorithm performs better on a concrete metric, and running the same test three months later will give the exact same results.

Long story short, as recommender systems research transitions towards industry, researchers find that they can’t evaluate their methods in a way that is very meaningful in practice. Therefore, the majority of practitioners take the reported performance of novel algorithms with a pinch (or lots) of salt and still use very basic methods. This is disheartening, and it slows down the progress of research.


Exhausted research topics

Currently popular topics are generally well researched; the same topics have been popular for the last decade. For example, factorization methods, context-awareness, cold-start avoidance, and hybrid algorithms have all been around for a while now. Even though the appearance of implicit feedback in research spiced things up, that too has become somewhat exhausted. This has naturally caused a slowdown, because additional research can only add a small epsilon to already existing solutions. I think the community is waiting for the next big thing, something that is fundamentally different and shakes things up. This new area, however, must deal with a problem that is important in practice and be algorithmically challenging and interesting to researchers. I think some researchers already have a candidate that could qualify. You could hear whispers among the crowd here and there, and several researchers I talked with mentioned a certain topic they will start working on shortly. 🙂 Optimistically, perhaps this year was just the calm before the storm, and the next few years might be the most exciting period of recommender systems research yet.


The lack of an industry track

There may also have been a conference-specific reason for the low number of exciting research papers at this year’s RecSys. RecSys is traditionally a conference for academia and industry, for research and application. However, papers can only be submitted to the research track; purely application-related presentations generally belong in the (invitation-based) industry sessions. To my surprise, there were several papers in this year’s research track that I would describe as high-quality engineering work. This type of work combines ideas from previous years’ research as components of a system that provides recommendations in a specific scenario of a specific domain. The technical quality of these papers is generally high, but their novelty for research is negligible. I think these papers have a place in a conference like RecSys, but I don’t agree with including them in the research track. Did these papers take slots from actual research papers? I’m not sure; maybe they did, or maybe it is the other way around: maybe there weren’t enough high-quality research papers, so the remaining slots were filled with high-quality engineering work. Whatever the case may be, I think the conference would benefit from a separate industry track for papers of the engineering kind.


I don’t think that there is a single reason behind the slowdown of research. I think all three of the aforementioned theories are correct to some extent. They – as well as other factors I haven’t considered – cumulatively caused this phenomenon.

Best paper picks of RecSys 2015

Despite my complaints, there were several papers at RecSys 2015 that I enjoyed. The following list contains my top picks from the main conference (long and short papers), with some justification of why I liked them. There are no workshop papers on the list, because I haven’t fully processed those yet. I restricted myself to three papers, the ones I found the most exciting. There were other papers and ideas I liked, but these were the most interesting to me. The items are listed in no particular order.


Gaussian Ranking by Matrix Factorization (by Harald Steck, Netflix Inc.)

The paper proposes a framework for directly optimizing ranking metrics, such as AUC and NDCG. While methods optimizing for certain ranking metrics have been proposed before, this framework is much more general and promises to handle every metric as long as it is differentiable with respect to the ranks of the items. This includes most of the popular metrics, such as NDCG, AUC, MRR, etc. (but unfortunately not recall@N). The key to the framework is the link between the scores and the ranks, which makes the ranking loss differentiable with respect to the model parameters. The whole idea is very elegant and has additional potential beyond the scope of the paper. (Also, bonus points for using the NSVD1 model, even if it is referred to as AMF. NSVD1 is a classic method that is unjustly forgotten, yet I review at least one paper a year that tries to reinvent it.)


Dynamic Poisson Factorization (by Laurent Charlin et al.)

I would probably have missed the original paper on Poisson factorization had it not been for this presentation at RecSys 2015. That would have been a shame, because the base algorithm is very interesting. It seems to be a better fit for implicit feedback than Gaussian factorization methods (or their frequentist counterparts that optimize for the sum of squared errors), and it has a few additional nice properties. The algorithm presented at RecSys builds on this novel factorization and introduces evolving user and item feature vectors into the mix. This answers a practical problem: the tastes of users and the audiences of items change, and we should model this somehow. The only thing I missed from the paper is a comparison with a factorization method supported by some kind of event-decay function. (Bonus points for the clear and comprehensible style of the paper. Given the complexity of the algorithm, conveying it in such a simple way was by no means easy.)


Predicting Online Performance of News Recommender Systems Through Richer Evaluation Metrics (by Andrii Maksai et al.)

How offline metrics relate to online KPIs has been an important and yet unanswered question. While several papers have suggested that optimizing for (ranking-based) accuracy might not be the best course of action, there was no clear alternative. What trade-off between accuracy and diversity is good? Will a 5% increase in recall@N translate into a noticeable increase in CTR? These were questions to which nobody had an exact answer, though if you’ve worked enough with recommenders you know a few rules of thumb. This paper might solve the issue, as it proposes to build a predictive model over 17 offline metrics to estimate CTR. Using this model, the authors present interesting findings, e.g. in news recommendation it seems that diversity and serendipity are a little more important than accuracy. I think the validity of the proposed approach depends entirely on how accurate the CTR prediction is over time. At first glance the results are convincing, but it is alarming that the constant CTR prediction also has a low error. Nonetheless, I think this is an interesting direction that is worth exploring in additional domains. (Bonus points for pointing out that different metrics of the same type (e.g. accuracy metrics) correlate highly with each other. Maybe future reviewers won’t mind if you don’t use their favorite accuracy metric.)
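
As a rough illustration of that approach (a sketch of the idea only, not the paper’s actual model or feature set), one could fit a regression from offline metrics measured on past recommender configurations to the CTR each achieved online:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Illustrative data: rows = past recommender configurations, columns =
# 17 offline metrics per configuration (e.g. recall@N, diversity, serendipity).
offline_metrics = rng.random((40, 17))
# Synthetic "observed" online CTR for each configuration.
online_ctr = offline_metrics @ rng.random(17) * 0.01

model = Ridge(alpha=1.0).fit(offline_metrics, online_ctr)
new_config = rng.random((1, 17))
print(model.predict(new_config))  # estimated CTR before running any online test
```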


In closing

This year’s RecSys conference left me with feelings of ambivalence. Although there were a handful of papers containing substantial research and contributions to the field of RecSys, the overall program felt lacking. The conference was well organized, but not as strong as it has been in the last few years. In the future, I hope to see the rise of one or more completely novel topics to shake up the field and make things within the RecSys research community exciting again. Lastly, the presentation of research and engineering papers side by side this year felt unnatural, and I believe that dividing these topics into separate tracks will benefit the conference in a big way.


About

Balázs Hidasi leads the data science team and is responsible for research and data mining activities at Gravity. He coordinates the team and also conducts his own research in the field of machine learning and data mining. His research revolves around (1) developing advanced recommender algorithms to make Gravity’s recommender engine even better; and (2) exploring new fields and application areas for recommender systems. Balázs also coordinates and consults on data mining projects (e.g. data analysis, POCs) within the company. He also has a blog.
