[Essays - Feedback Wanted] 16 Fall EE PhD application: first draft of my PS, harsh critique welcome

xsh6528 posted on 2015-11-3 20:10:54


Having done research on online learning and differential privacy for more than a year as an undergraduate, I find great interest and self-accomplishment in this area. Besides, online learning and differential privacy are very practical tools for mining essential information from big data and for addressing privacy issues in social media. Successful research on these techniques can bring great benefit to both industry and individuals. Consequently, I have a strong desire to go further in this area through graduate study. Graduate study in this area is also necessary for my long-term goal of pursuing a career in teaching and research. I have prepared myself for graduate study with solid independent research training, high intelligence and creativity in dealing with hard problems, and an excellent course foundation.

Solid research training in SINCLab for nearly one year has equipped me with a strong ability to master new techniques and put them into application. This research experience also gave me a firm background in online learning. (At the beginning of my junior year, I had the great fortune to make the acquaintance of Prof. Pan Zhou, my research advisor. I then started to work with great passion as a research assistant in his team, focusing on online learning.) My ability to absorb new knowledge really got stronger during this period. At the beginning, I felt dizzy and restless when faced with unfamiliar technical papers. Though it was difficult, I calmed down, buried myself in the papers, and thought carefully about every word, definition, and equation. Much to my delight, I was able to report my newly acquired techniques in weekly meetings and finally became proficient in the contextual multi-armed bandit (C-MAB). In detail, C-MAB is a classic area of online learning, in which learners adjust their selection strategy by observing the rewards of their past actions. Later, I showed excellent application capability by applying the C-MAB model to recommendation systems. As the research continued, after reading articles about recommendation systems and social networks, I found that C-MAB was quite appropriate for recommendation in social networks, because selecting an action in each round of C-MAB is similar to selecting items for users in recommendation. In addition, learners can leverage rich user-generated contextual information in social networks to improve their learning ability. Unfortunately, this idea raised a new problem: how to deal with user-generated social big data? Storing and updating the expected rewards of C-MAB in this big data context is extremely challenging, because there is not enough storage space to record past rewards for each user.
After surveying big data analytics, I proposed a cloud-based recommendation system in which servers are modeled as decentralized learners that work cooperatively by communicating over links. In addition, I learned and applied a new technique, adaptive partition, in my model: users are classified into groups, and rewards are recorded only per group. I then wrote and submitted a technical paper reporting this work.
I demonstrated my high intelligence and creativity by successfully introducing differential privacy into distributed C-MAB. As I went further into the online learning area, I realized that there exist two privacy problems in this recommendation process: 1) users' sensitive attributes can be exposed by the selected actions, and 2) observed rewards can leak information about actions. In this regard, traditional encryption and anonymization require complex computation that does not fit the big data context at all. Thus, I applied differential privacy, a very novel tool, which informally means that private information cannot be deduced from the outcomes when two inputs are similar. Specifically, for users' privacy, learners randomly select actions according to computed probabilities. I finally proved that this randomness not only protects the privacy of users but also allows the learner to obtain approximately optimal revenue. However, another big challenge emerged when the learners' privacy was considered. As the rewards are directly determined by the selected action, random selection cannot solve this problem at all. At this point, I changed my view: why not add randomness directly into the rewards, sacrificing a little learning rate but getting privacy protection in return? But the way to success was not as easy as I had imagined. The Laplace mechanism is designed for static databases, whereas rewards in C-MAB are observed continually. After reviewing many papers related to my problem, I found an efficient method to deal with this issue: tree-based aggregation, in which I put each newly produced reward into a leaf node of a constructed binary tree. Then I applied my proposed model and algorithms to social network advertising and have submitted a paper reporting the work. Beyond that, considering the sparsity of big data, I proposed a novel geometric differential privacy method that can substantially decrease learning regret while still guaranteeing privacy preservation.

Typical graduate courses have laid a solid foundation for my graduate study in online learning. Excellent performance in math courses such as “Probability and Mathematical Statistics”, “Linear Algebra”, and “Stochastic Process” puts me in a favorable position to carry out complicated theoretical analysis. In addition, courses such as “Data Mining” have broadened my knowledge of the applications of online learning. Moreover, to keep up with new knowledge, I studied important chapters of “Convex Optimization”, such as “Duality”, “Convex optimization problems”, and “Descent methods”.

Three years of undergraduate education and one year of independent research experience have fully prepared me to pursue graduate study. I expect to go further into the online learning area through graduate study. As far as I know, techniques in online learning are not limited to MAB. Besides, there also exist plenty of algorithms for MAB, such as UCB and Exp3. Deriving new theorems to deal with big data analytics and privacy issues also attracts me deeply.

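My main questions are: 1. Should the research experience be written in this much detail? 2. I keep emphasizing online learning. Should I really describe my field this specifically, or should I just say AI or ML?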



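xsh6528 posted on 2015-11-3 20:11:34
cassie_huang posted on 2015-11-5 00:12:14
I'm writing my PS too, and after several rounds of revision I'm still not satisfied. I think online learning is too narrow. For the direction stated in the first and last paragraphs, just write machine learning; in the middle you can mention that you lean toward online learning and optimization.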
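xsh6528 posted on 2015-11-5 13:32:06
cassie_huang posted on 2015-11-5 00:12
I'm writing my PS too, and after several rounds of revision I'm still not satisfied. I think online learning is too narrow. For the direction stated in the first and last paragraphs, just write machine ...
Very nice suggestion! I also think it's too narrow, so I'll write machine learning instead.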
mzry posted on 2015-11-7 22:30:12
城门白昼 posted on 2015-11-7 22:58:50
“I find great interest and self-accomplishment in this area”: say specifically what it is that interests you and gives you a sense of accomplishment.
“My ability to absorb new knowledge really get stronger during this period.”: I think this could be explained concretely, i.e. how exactly it got stronger.

