In the era of big data, data-driven methods have become increasingly popular in applications such as image recognition, traffic signal control, and fake news detection. The superior performance of these approaches relies on large-scale labeled training data, which are often unavailable in real-world applications; this is the "small (labeled) data" challenge. Examples include predicting emergent events in a city, detecting emerging fake news, and forecasting the progression of rare diseases. These small-data cases are often the ones people care about most, so improving the effectiveness of machine learning algorithms with small labeled data has become a popular research topic. In this tutorial, we review state-of-the-art machine learning techniques for learning with small (labeled) data. The tutorial is organized in two parts: (1) a comprehensive review of recent studies on knowledge generalization, transfer, and sharing, covering transfer learning, multi-task learning, and meta-learning, with a particular focus on meta-learning, which improves model generalization and has recently proven effective; and (2) an introduction to cutting-edge techniques for incorporating domain knowledge into machine learning models. In contrast to model-based knowledge transfer techniques, domain knowledge (e.g., physical laws) offers a new angle on the small data challenge in real-world applications. Specifically, domain knowledge can be used to optimize learning strategies and/or guide model design. We believe that learning with small data is a trending topic in the data mining field with important social impact, and that it will attract researchers and practitioners from both academia and industry.
|Original language||English (US)|
|Title of host publication||KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining|
|Publisher||Association for Computing Machinery|
|Number of pages||2|
|State||Published - Aug 23 2020|
|Event||26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020 - Virtual, Online, United States (Aug 23 2020 → Aug 27 2020)|
|Publication series||Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining|
|Conference||26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020|
|Period||8/23/20 → 8/27/20|
Bibliographical note
Funding Information:
The work was supported in part by NSF awards #1652525, #1618448 and #1934721. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agencies.
© 2020 Owner/Author.
- domain knowledge
- knowledge transfer