Industry Leaders in Signal Processing and Machine Learning: Luna Dong

Hamid Palangi

Dr. Luna Dong is an ACM Distinguished Member, recognized for her contributions to knowledge integration and knowledge fusion, and the recipient of the VLDB Early Career Research Contribution Award for "Advancing the state of the art of knowledge fusion". The Knowledge-based Trust project she led at Google was called the “Google Truth Machine” by The Washington Post. She serves on the VLDB Endowment and the PVLDB advisory committee, and is a PC co-chair for WSDM'2022, VLDB'2021, and SIGMOD’2018.



We approached Luna Dong with a few questions:

1. In your own words, please tell us about your background.

I just joined Facebook AR/VR Assistant, leading science efforts there to make AR/VR devices more intelligent. Prior to joining Facebook, I was a Senior Principal Scientist at Amazon, leading the efforts to build the Amazon Product Knowledge Graph since 2016, and before that, a Research Scientist at Google, working on the Google Knowledge Graph and the Knowledge Vault project. I received my Ph.D. in Computer Science from the University of Washington.

I have been spending the past 10 years or so building the world's largest knowledge graphs, including generic knowledge graphs such as Google Knowledge Graph, and domain knowledge graphs such as Amazon Product Graph. To construct comprehensive knowledge graphs and put them to real applications, I have been conducting research on information extraction, data integration, data cleaning, graph mining and embedding, and knowledge-based search and recommendation. As a scientist in industry, I have been leading end-to-end science innovation cycles including identifying customer needs, inventing state-of-the-art ML techniques, implementing and deploying production features, and launching new customer experiences.

2. What challenges have you had to face to get to where you are today? 

As a scientist in industry, the biggest challenge I have had to face is how to balance science innovations and product impact. Indeed, the beauty of doing science in industry is the opportunity to make real impact directly from the science innovations. However, the impact requires efforts far beyond writing a paper or a series of papers: high-quality performance needs to be achieved to create good customer experiences; the techniques need to be generalized and scaled up to enable broader applications; solid production pipelines need to be developed and deployed to make the process repeatable; and user-friendly product features need to be carefully designed so that users and customers really benefit from the brilliant research ideas underlying them. There wouldn’t be much real impact if any of these were missing, and achieving it requires scientists to develop a good taste for innovation direction, to be practical to make things happen, and to collaborate closely with people in different roles and with different expertise. I have had successes as well as lessons in striking this balance, and I’m still learning.

3. What was the most important factor in your success?  

That is learning; never stop learning. I came from a Database background and did my PhD on data integration; that is, how to seamlessly integrate data from different data sources with different formats and oftentimes different semantic meanings. This is clearly relevant to building a comprehensive knowledge graph, which requires pulling data from everywhere and aligning them. But data integration by itself is far from enough: data can be in unstructured text and need to be extracted, requiring NLP; data can be in images and need to be extracted or inferred, requiring Computer Vision; data can be noisy and need to be cleaned, requiring anomaly detection and data cleaning techniques from Database and Data Mining; patterns need to be extracted from the factual data, requiring knowledge embedding and mining from Data Mining. And I haven’t yet mentioned applications such as search, question answering, and recommendation, which spread across fields like Search, NLP, Data Mining, and ML. I could say no to these many fields far away from where I came from, or I could learn to embrace all these new opportunities. I chose the latter. The learnings allowed me to start publishing and earning visibility in Data Mining, NLP, and Search; more importantly, the learnings gave me the capability to really collect knowledge from everywhere and use it to best serve people.

It might be intimidating to think about learning totally different fields. I have two tips that help me step out of my comfort zone but still stay fairly comfortable. First, learn a little bit each time, but do so constantly. In 2017, I spent 30-45 minutes reading papers every day (possibly longer on weekends). By the end of the year I had read 200 papers, and felt quite accomplished. Second, get the amount of learning right. When I can choose (a job, a project, a new position), I choose things where 85% relies on my existing knowledge and skills, and 15% requires learning. In this way, I won’t be bored and will be motivated to get out of my comfort zone, and meanwhile, I won’t be too challenged and stressed out.

My career has been a process of going deep in an area, then broadening to a bigger area, then going deep in the bigger area, and then broadening again, and so on. In this way, I started from data integration (of structured data), moved next to data quality (and started the knowledge fusion research area), then to knowledge integration and fusion, later to knowledge collection and application, and now to search/QA/recommendation using all available information. Not only do I try to learn different fields in science, I also learn engineering, management, product design, economics, and psychology; I learn everything that can take me closer to my dream.

Only if we learn like a sponge can we grow like a sprout.

4. How does your work affect society? 

One difference between human beings and animals is that we pass knowledge from generation to generation through language and media (e.g., books), rather than passing instinct through genes. Sheep know at birth who their enemies are and when they are in danger; this instinct was coded into their genes over many generations the hard way, costing many lives. Human beings have languages and use them to teach their descendants complex knowledge; thus human knowledge accumulates and grows exponentially. However, such knowledge is scattered in different places, in different languages, and on different media (texts, images, video, etc.). What limits our innovation in the modern age is less inadequate knowledge than the lack of capability to synthesize all the scattered information and knowledge and make the best use of it.

My work is to make computers more knowledgeable, and ultimately to serve people the right information at the right time, to make the world better with happier and more effective people. Building comprehensive knowledge graphs is certainly an important part of it, as it puts knowledge at our fingertips. It is also important to decide when it is better to collect knowledge and persist it in a knowledge graph, when it is better to quickly access some information for instant needs, and when we should discard out-of-date information; this is because not all information is knowledge and we need to choose. Finally, it is super important to decide what knowledge or information is relevant to the time, place, and context, and serve knowledge selectively, so we are the masters of knowledge, rather than its slaves. There is a long way to go and a lot of research needed.

5. If there is one take home message you would want the readers of this interview to take away, what would that be? 

I always remember a question one of my mentors asked me: 30 years later, when you look back at the most important innovations, are you a part of it? This is attributed to Guha, Google Fellow. Since I heard that question, I’ve been viewing my research and my career in a different way. Is my research leading to innovations that will change the lives of humankind? What is the technology trend and how can I leverage that or align better? How to bring my science inventions to people’s lives and make them meaningful? How to generalize and scale up my inventions for broader impact? How can my innovations inspire other people in their fields and their innovations?

If there is one take home message for the readers, I’d like to invite everyone to think of Guha’s question and to become the driving force for innovations that make the world a better place to live.

6. Failures are an inevitable part of everyone’s career journey. What is the most important lesson you have learned during your career when dealing with failures?

Working backwards is a big lesson I learned from failures in my early career. I’m grateful that I was taught this lesson, honestly, in a tough way, so I was able to avoid many detours later.

I did a project at Google with great academic success: we published a paper that accumulated over 1500 citations within 8 years and inspired a lot of research in the field. But the same project was a production failure: we couldn’t launch it anywhere at Google, and eventually it was shut down. Why? There are many reasons, and an important one is that it did not address any real pain point. We did the project because we thought it was a cool idea; we measured success using a metric that we thought was a cool metric. Did it bring new or better features that would serve Google users better? We hadn’t thought much about it.

It took reflection, sometimes years of reflection, to realize what might have gone wrong. I didn’t realize why this academic success met this sad production fate until I joined Amazon and learned the concept of working backwards. You need to first know the customer experiences you’ll provide, and then work backwards to figure out the techniques you need to invent, the systems you need to launch, and the infrastructure you need to build. It is very different from, if not totally opposite to, the typical scientist's thinking process, in which we envision new techniques and then leave it to others to figure out their uses. Brilliant scientific ideas will sooner or later enable great impact. But if we want to accelerate this process, scientists should, at a minimum, be proactive in thinking about the destination of the journey to better guide our innovations.

7. Although novelty and innovation are the most important factors for technology advancement, when a researcher, scientist or engineer has a new idea there is a lot of push-back until they prove that the new idea actually works. What is your advice on how to handle it, especially for readers who are in the early stages of their career?

This situation reminds me of a hiking experience I had with my kids: we walked through a very long tunnel, for nearly an hour, before we got out of it. There were dim lights, but the tunnel was still quite dark, damp, and unpleasant. Yes, you have the same feeling before your innovations see the light, and even worse, it feels much darker.

How to get out of the darkness? First, you need to know where the light is and where the destination is; in other words, you need to know what success looks like, ideally in a form that can be measured. In industry, it is often measured by a user-facing metric (e.g., conversion rate, number of active users, revenue). Second, you need to put some lights in the tunnel to make the place a bit more pleasant, and to give you and others confidence that you’re making progress instead of getting totally lost in the darkness. What are those lights? They are meaningful metrics and milestones that you and others agree upon. For example, let’s say you are proposing some novel ideas to do QA in a context where we have never seen success before. Recall @ Precision=90% (R@P) is a meaningful metric in this case. You start by working toward 90% precision (i.e., for every 10 questions you answer, 9 are correct, so your answers are reasonably reliable); as you gradually increase the precision to meet the 90% bar, you know you’re getting closer and closer to the first big milestone. Once you have achieved a precision of 90%, and hopefully a reasonable recall, you can launch this QA feature to nail down the first success (sometimes even 5% recall is reasonable for an initial launch), but you still may not be able to justify your impact. That’s the time to gradually improve your recall, and see happier and happier users and hopefully higher and higher user-facing metrics. Even if people won’t be fully convinced until they see a high-enough user-facing metric, as long as they agree with you on the precision bar and the R@P metric, they’ll gain more and more confidence as they see your progress, and you’ll have more confident partners and fewer push-backs.

Certainly, only knowing the destination of the tunnel and putting up lights along your journey is not enough; you need to make progress. The “working backwards” philosophy described earlier applies here.
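For readers who want to try the Recall @ Precision=90% metric mentioned in this answer, here is a minimal Python sketch of one way it could be computed. It is an illustration only, not the method used in any of the projects discussed. It assumes each question gets one answer with a confidence score, that the system answers only questions scoring at or above a cutoff, and that recall means correct answers given divided by all questions asked. The function name `recall_at_precision` and the sample data are hypothetical.

```python
from typing import Sequence


def recall_at_precision(
    scores: Sequence[float],
    correct: Sequence[bool],
    min_precision: float = 0.9,
) -> float:
    """Highest recall reachable by a confidence cutoff with precision >= min_precision.

    Assumption: recall = (correct answers given) / (all questions asked).
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")
    total = len(scores)
    if total == 0:
        return 0.0

    # Answer the most confident questions first.
    ranked = sorted(zip(scores, correct), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    answered = 0
    right = 0
    for i, (score, is_correct) in enumerate(ranked):
        answered += 1
        right += int(is_correct)
        # A cutoff cannot separate questions that share the same score.
        if i + 1 < total and ranked[i + 1][0] == score:
            continue
        if right / answered >= min_precision:
            best_recall = max(best_recall, right / total)
    return best_recall


# Hypothetical evaluation set of 10 questions (listed by descending confidence).
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20]
correct = [True, True, False, True, False, False, True, False, False, False]

print(recall_at_precision(scores, correct, min_precision=0.9))
print(recall_at_precision(scores, correct, min_precision=0.75))
```

On this made-up data, only the two most confident answers can be served while keeping precision at or above 90%, so the recall at that bar is 20%. Relaxing the bar to 75% lets the system answer four questions, three of them correctly, for a recall of 30%. Tracking this one number over time is the kind of "light in the tunnel" the answer describes.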

8. Is there anything else you would like to add?

If there is one more thing to add, it is relationships: the relationship between you and your teammates, between you and your collaborators, and between you and your customers.

As computer scientists, we know how to interact with machines, and recently how to make machines intelligent. Good computer scientists also often know how to stand out in competitions. But achieving something big requires collective efforts from a team of people, oftentimes a big team, or even the entire research community united. Effectively working with or leading a team of people requires very different skill sets. A key part of that skill set is building good relationships: you need to see the people behind the techniques and behind the products.

There are many things one can say regarding building good relationships with others. What I’d like to say to research scientists is that people are very different from machines. Instead of providing scripts to “program” your colleagues (or even students), you engage them, trust them, grow them, and care for them. Only when you make others succeed, can you achieve bigger success and higher impact. And more importantly, you’ll enjoy this relationship and your life and work will be more fun.

To learn more about Luna Dong and for more information, visit her webpage.




