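I entered the FB DS interview process through a referral in November 2018. After more than three months of preparation and interviews, HR called in February 2019 to say I had passed the onsite and that FB was starting to prepare an offer. Three weeks later, however, I was still stuck with the immigration team. During that time I went through joy, waiting, anger, resentment, and helplessness, and in the end I accepted it: I signed with another small company and started applying for the new year's H1B. Funnily enough, FB HR still has not given me an answer. The immigration team keeps stalling, and there is not even an explicit rejection. So be it.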
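First, the timeline. I am not sure how useful it is as a reference, but everything felt very slow after the second phone interview:
18/11/20: HR contacted me by email
18/11/21: HR phone call; they sent materials for the first interview and scheduled the phone interview
18/12/3: First phone interview
18/12/3: In the afternoon HR said the product part of my first interview was unsatisfactory but the SQL was fine, and scheduled a second phone interview
18/12/13: Second phone interview
18/12/18: HR said I passed the second interview and started arranging the onsite
18/12/27: Handed over to the onsite coordinator
In between, HR asked for immigration materials twice and confirmed that I had 25+ months of OPT left
19/1/14: Onsite scheduled for 19/1/25
19/2/15: The onsite coordinator called to say I passed and that an offer was being prepared
After that I used other offers to push them countless times. The onsite coordinator still says the immigration team is short-staffed and the departments are not well coordinated, so there is still no answer, and I gave up.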
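Because I went back to China on vacation on December 20 and returned on January 4, I was able to prepare full-time, away from work, for a bit more than two weeks. Together with the previous two months of serious study on weekends and weekday evenings, I felt all four onsite interviews went fairly smoothly, and there was no question I had not prepared for. Although my mood has been poor lately, I decided today to write a comprehensive FB DS preparation post for everyone who is preparing, in gratitude for what people on this forum have shared.
Since I have never actually worked as a DS at a FLAG company, corrections are welcome if my understanding is off.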
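1 Overview
I personally feel the FB DS interview is like the gaokao (the college entrance exam): everything has a standard answer, even the product questions. Almost all questions come from a question bank, so interview performance is proportional to the time spent preparing. If your answer misses the point, the interviewer will keep asking "what else, what else", which is very frustrating. First, the question types:
The FB onsite has four parts, each ending at 30 minutes.
- Analysis Case: Product Interpretation
- Analysis Case: Applied Data
- Quantitative Analysis
- Technical Analysis
(A phone interview is usually 45 minutes, SQL + product, i.e. a combination of parts 1 and 4 above.)
Of these four parts, 1 is product understanding, 2 combines product and data, 3 is basic probability and statistics, and 4 is SQL. Below is my understanding of each part.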
2 Analysis Case: Product Interpretation
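Before my first phone interview I had no idea how to approach this part. The reference material FB gave was https://medium.com/stellarpeers. After reading its posts about FB I felt even more confused, and I did not see how one question could be answered at such length. So my product part in the first interview was not great.
Then I started reading interview reports on this forum. At this stage, the two posts I found best were:
https://www.1point3acres.com/bbs ... D311%26sortid%3D311
https://www.1point3acres.com/bbs/thread-462895-1-1.html
Both mention that you should clarify the question, define metrics, and give structured answers.
But this guiding principle is not enough on its own; you need to practice. The forum has plenty of product interview questions, but usually only the questions are posted, without answers, so you have to think them through yourself.
Common questions from interview reports: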
1) How to determine who is a best friend https://www.1point3acres.com/bbs/thread-465021-1-1.html
2) Adding a feature to Marketplace
https://www.1point3acres.com/bbs/thread-465018-1-1.html
3)SPAM
https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=446618&extra=page%3D5%26filter%3Dsortid%26sortid%3D311%26sortid%3D311
https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=449091&extra=&page=1
4) Product health
https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=373072&extra=&page=1
https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=405396&extra=&page=1
5) Parents joining FB
https://www.1point3acres.com/bbs/thread-282664-1-1.html
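There are other posts that I will not list one by one. These high-frequency questions basically follow the same pattern: once you have defined the problem and chosen the metrics, you are halfway there. The rest is divergent thinking, breaking the analysis down by segment.
After the posts, I carefully read the well-known A Collection of Data Science Take-Home Challenges three times (a free download is now available on the forum; I bought it myself at the time, which hurt). On the first read I kept thinking: I paid a few hundred dollars for this? Toward the end I started to get a feel for product questions, and after two more careful reads I felt the author clearly comes from industry with rich DS experience. The answers are concise and well worth studying.
Since the book is nearly 100 pages, I condensed it into a few question types. The answers below are all condensed from the book: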
1) Classic question: 15% drop in FB group usage
  1) Clarify: what specifically dropped (which metric), and by how much (practically significant / statistically significant)? If it is not significant, there is no need to go on.
  2) Then high level:
    a. Is it one-time or progressive? A one-time significant drop is highly likely a tech issue. A seasonal pattern is also a plausible explanation.
    b. Does the drop happen in other features? If it does, we have a bigger problem.
    c. Cannibalization.
    d. Does the drop also happen in competitor products? Maybe a competitor launched something new. If yes, it may be a cross-platform industry issue.
  3) Then deep dive (check whether anything changed in one of the segments, or whether nothing changed within segments but the distribution across them changed):
    a. New users vs. old users (cohort)
    b. Language
    c. Country
    d. Platform
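The deep-dive step can be sketched in a few lines of Python. This is only an illustration, assuming hypothetical per-platform counts of active users before and after the drop; the segment names, the numbers, and the function `segment_breakdown` are made up and do not come from the book or from FB.

```python
def segment_breakdown(before, after):
    """Return the overall relative change and, per segment, its own
    relative change and its share of the total change."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    total_delta = total_after - total_before
    overall = total_delta / total_before
    rows = {}
    for seg, base in before.items():
        delta = after.get(seg, 0) - base
        rows[seg] = {
            # Change inside the segment itself (assumes base > 0).
            "rel_change": delta / base,
            # How much of the overall move this segment explains.
            "share_of_total_change": delta / total_delta if total_delta else 0.0,
        }
    return overall, rows


# Hypothetical weekly active users of a feature, split by platform.
before = {"ios": 400, "android": 500, "web": 100}
after = {"ios": 396, "android": 355, "web": 99}

overall, rows = segment_breakdown(before, after)
for seg, stats in rows.items():
    print(seg, round(stats["rel_change"], 3), round(stats["share_of_total_change"], 3))
```

With these made-up numbers almost all of the drop sits in one platform, which is the kind of finding that would point toward a platform-specific release or bug rather than a product-wide problem.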
2) How to improve the product? The question is not asking you to be visionary, but checking whether you can find things in datasets as a data scientist. Always try to incentivize "good" and dis-incentivize "bad".
  1) First, define the target, say engagement (in order to move long-term retention and revenue).
  2) Then choose the metric used to evaluate engagement, e.g. the proportion of users who take at least one action per day interacting with the site.
  3) Pick variables that would move the metric: use both user characteristics and user behavior.
  4) Use a model (random forest is good here) to check the relationship between segment and engagement. Come up with several scenarios to explain the results and make suggestions based on them (which segment to improve).
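As a small illustration of step 2, here is a sketch that computes the engagement metric (share of users with at least one action on a given day) per segment. It does not fit a random forest; it only produces the per-segment metric that such a model would try to explain. The user table, the event log, and the function name `engaged_share_by_segment` are hypothetical.

```python
from collections import defaultdict


def engaged_share_by_segment(users, events, day):
    """users: {user_id: segment}; events: iterable of (user_id, day).
    Returns {segment: share of users with at least one action on `day`}."""
    active = {uid for uid, d in events if d == day}
    totals = defaultdict(int)
    engaged = defaultdict(int)
    for uid, seg in users.items():
        totals[seg] += 1
        if uid in active:
            engaged[seg] += 1
    return {seg: engaged[seg] / totals[seg] for seg in totals}


# Hypothetical data: six users in two cohorts and a tiny event log.
users = {1: "new", 2: "new", 3: "old", 4: "old", 5: "old", 6: "old"}
events = [
    (1, "2019-01-01"),
    (3, "2019-01-01"),
    (4, "2019-01-01"),
    (5, "2019-01-01"),
    (3, "2019-01-02"),
]

shares = engaged_share_by_segment(users, events, "2019-01-01")
print(shares)
```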
3) Fake/Fraud detection: The key with fraud is that it does not happen only once. People who commit fraud tend to repeat it if they are not caught. All the variables are really about something that should be unique but is not, or about extreme values. Hence two main ways to capture fraud:
  1) Same device IP/bank account/phone number as existing accounts;
  2) Anomaly detection: find outliers (e.g. an extremely low price).
  - More specifically, for Marketplace postings we can look at the listing and the seller. For a listing: pictures should not be stolen from elsewhere, descriptions should not be copied, the resolution should not be too low, and the price should not be too low.
  - For a fake profile (say, a fake school): use an ML algorithm or anomaly detection to find outliers. For instance, you may include as variables the percentage of connections who went to the same school, interaction with people from the same school, and the acceptance rate of friend requests sent to people from the same school. To minimize fake profiles, you may want to use 2-step verification for risky users (to minimize false negatives; you may not want to apply this to all users).
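The two approaches can be sketched as follows. This is a minimal illustration under assumptions: the account and listing records are invented, and the outlier rule uses the modified z-score based on the median absolute deviation with a cutoff of 3.5, which is one common convention rather than anything prescribed by the book.

```python
from collections import defaultdict
from statistics import median


def shared_identifier_groups(accounts, key):
    """Group account ids by an identifier that should be unique
    (device, bank account, phone) and keep only the shared ones."""
    groups = defaultdict(list)
    for acct in accounts:
        groups[acct[key]].append(acct["id"])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


def low_price_outliers(listings, threshold=3.5):
    """Flag listings whose price is far below the median, using the
    modified z-score 0.6745 * (x - median) / MAD."""
    prices = [price for _, price in listings]
    med = median(prices)
    mad = median([abs(price - med) for price in prices])
    if mad == 0:
        return []  # no spread, nothing can be called an outlier
    return [lid for lid, price in listings
            if 0.6745 * (price - med) / mad < -threshold]


# Hypothetical records.
accounts = [
    {"id": "a1", "device": "d1"},
    {"id": "a2", "device": "d2"},
    {"id": "a3", "device": "d1"},
]
listings = [("l1", 100), ("l2", 110), ("l3", 95), ("l4", 105), ("l5", 5)]

print(shared_identifier_groups(accounts, "device"))
print(low_price_outliers(listings))
```

The median-based rule is used here instead of mean and standard deviation because a single extreme price would otherwise inflate the spread and hide itself.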
4) What features to add? Again, don't be tempted to be a visionary. Start from the datasets. Look at the current data and check what you want to incentivize people to do. Then simplify the procedure. You can also learn about customer needs from complaints or comments. Then run an A/B test to see whether the feature meets your needs. E.g.: figure out a way for a user to finish things in one click, or check use cases to find opportunities.
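For the A/B testing step, a standard way to compare a conversion-style metric between control and treatment is a two-proportion z-test. The sketch below uses only the Python standard library; the sample sizes and conversion counts are hypothetical, and a real test would also need a pre-chosen sample size and significance level.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference of two proportions,
    using the pooled proportion for the standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 10% conversion in control, 13% in treatment.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(round(z, 2), round(p_value, 4))
```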
5) Should we introduce XXX feature? Layers of logic: 1) If we add it, what benefits will we get?
[The rest of this section is hidden content in the original post.]
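4 Quantitative Analysis
I think anyone who has studied probability theory and mathematical statistics should have no trouble with this part. Bayes' formula P(A|B) is almost always tested. Confidence intervals, p-values, and A/B testing also come up, as does the distribution of common metrics (the exponential distribution is the answer). The question about the expectation and variance of showing an ad once every 25 versus showing an ad 4% of the time has also appeared often in recent months.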
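As a quick refresher on the Bayes question type, Bayes' formula is P(A|B) = P(B|A) P(A) / P(B), where P(B) = P(B|A) P(A) + P(B|not A) P(not A). The sketch below applies it to a made-up spam-flagging example; the probabilities are hypothetical and are not taken from any interview question.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(A|B) via Bayes' formula.
    prior = P(A), likelihood = P(B|A), false_positive_rate = P(B|not A)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical: 1% of posts are spam, the classifier flags 90% of spam
# and 5% of legitimate posts. How likely is a flagged post to be spam?
p_spam_given_flag = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p_spam_given_flag, 4))
```

Even with a fairly accurate classifier, the low base rate keeps the posterior well below one half, which is the usual point of this kind of question.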
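I think reviewing basic statistics once is enough. You should be familiar with the expectation and variance of the common distributions.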
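The ad question mentioned above is not spelled out in this post. One reading, which is my assumption, is a comparison between a fixed schedule (exactly one ad in every 25 stories) and a random rule (each story is independently an ad with probability 4%). Under that reading the two options have the same expected number of ads but very different variance, as this sketch shows for 100 stories.

```python
def fixed_schedule(n_stories, every=25):
    """Exactly one ad in each full block of `every` stories.
    The count is deterministic, so the variance is zero."""
    return n_stories // every, 0.0


def independent_chance(n_stories, p=0.04):
    """Each story is an ad independently with probability p,
    so the count is Binomial(n, p): mean n*p, variance n*p*(1-p)."""
    return n_stories * p, n_stories * p * (1 - p)


fixed_mean, fixed_var = fixed_schedule(100)
random_mean, random_var = independent_chance(100)
print(fixed_mean, fixed_var)
print(round(random_mean, 2), round(random_var, 2))
```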
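5 Technical Analysis
This part is all about a lot of repeated practice. SQL itself is simple, but you get nervous on the spot and time is short, so you have to write it regularly. By my own count, I wrote each common question about 3-5 times and practiced 150+ SQL queries on a whiteboard in total. Even when you feel you have no problems at all, you will still be a bit nervous in the room, and if you have not written much beforehand you are quite likely to get stuck.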
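My own practice order was:
2) I did the SQL in A Collection of Data Science Take-Home Challenges twice.
3) A forum member compiled a very good post that you can practice with repeatedly. It requires fairly high reading permission, and I do not feel comfortable pasting someone else's content directly, so if you do not have enough points, keep up the daily check-ins to earn them. https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=432902&highlight=fb%2Bsql
4) Shortly before the onsite there was another comprehensive post that is also well organized. I practiced its SQL two or three times as well. https://www.1point3acres.com/bbs/thread-472684-1-1.html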
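Basically, SQL keeps teaching you something new the more you write it. Sometimes you only notice a problem while writing, and you might not spot it by just reading other people's queries.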
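One way to check queries written on a whiteboard is to run them against a tiny in-memory table. The sketch below uses Python's built-in `sqlite3` module; the `actions` table, its columns, and the data are hypothetical and are not an actual interview question. It counts daily active users and total actions per day.

```python
import sqlite3

# In-memory database with a small, made-up action log.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (user_id INTEGER, action_date TEXT, action TEXT);
INSERT INTO actions VALUES
  (1, '2019-01-01', 'post'),
  (1, '2019-01-01', 'like'),
  (2, '2019-01-01', 'like'),
  (1, '2019-01-02', 'post'),
  (3, '2019-01-02', 'comment');
""")

# Daily active users (distinct users) and total actions per day.
QUERY = """
SELECT action_date,
       COUNT(DISTINCT user_id) AS active_users,
       COUNT(*) AS total_actions
FROM actions
GROUP BY action_date
ORDER BY action_date;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Forgetting `DISTINCT` here would count actions instead of users, which is exactly the kind of slip that is easy to miss when only reading someone else's query.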
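6 Summary
In short, you reap what you sow, and diligent practice pays off. Although things were held up by the immigration team in the end, I still feel I gained a lot from the process.
I heard from a friend that FB DS total compensation this year can reach 200k+. So envious. Good luck to everyone who is interviewing, and best wishes for the new year!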