回复: 64
收起左侧

Facebook Data Scientist SQL面经+参考答案

   
本楼:   👍  26
100%
0%
0   👎
全局:   173
99%
1%
1

2018(7-9月) 分析|数据科学类 硕士 全职@facebook - Other - 技术电面 Onsite 在线笔试  | Other | 在职跳槽

注册一亩三分地论坛,查看更多干货!

您需要 登录 才可以下载或查看附件。没有帐号?注册账号

x
在地里看到的面筋有2个痛点:
1. 逻辑混乱,非常confusing (还不如不贴)
2. 没有答案

在这里我抛砖引玉,贴一些我整理+亲自经历过的电面SQL题, 并附上我的答案。希望能帮助到大家(求大米)。

1. There is atable that tracks every time a user turns a feature on or off, with columnsuser_id, action ("on" or "off), date, and time.

1) How manyusers turned the feature on today?
USER_ID || ACTION||DATE||TIME


SELECTCOUNT(DISTINCT USER_ID)  
FROMTABLE
WHEREDATE = CURDATE()
ANDACTION = 'on';

How manyusers have ever turned the feature on?
SELECTCOUNT(DISTINCT USER_ID)
FROMTABLE
WHEREACTION = 'on';

2) Create atable that tracks the user last status every day.
SELECT A.DATE, B.USER_ID, B.STATUS
(SELECT GENERATE_SERIES('2018-01-01'::DATE,'2018-09-01'::DATE, '1D')::DATE) TABLEA(DATE)  
LEFT JOIN  
(SELECT * FROM
TABLE  
QUALIFY ROW_NUMBER() OVER (PARTITION BY USER_ID, DATEORDER BY TIME DESC)  =1
) B
ON TABLEA.DATE >= B.DATE
QUALIFY ROW_NUMBER() OVER(PARTITION BY A.DATE, B.USER_IDORDER BY B.DATE DESC) =1

3) In a tablethat tracks the status of every user every day, how would you add today's datato it?   
/*ASSUMINGACCOUNTS KEY ARE UNIQUE IN TABLE_TODAY */
/*OTHERWISEWE CAN PICK THE LAST ACTION IN TODAY AS THE STATUS*/

SELECTA.*
FROMEVERYDAY_SATUS
UNION
(
SELECT  
COALESCE(A.USER_ID,B.USER_ID) AS USER_ID
,CASEWHEN A.USER_ID IS NULL THEN B.STATUS
WHENA.USER_ID IS NOT NULL AND B.USER_ID IS NULL THEN A.ACTION
WHENA.USER_ID IS NOT NULL AND B.USER_ID IS NOT NULL THEN A.ACTION
END ASSTATUS
,CURDATE()AS DATE
FROM
TABLE_TODAYA
FULL OUTER JOIN  
TABLE_EVERYDAYB
WHEREA.USER_ID = B.USER_ID
ANDB.DATE= CURDATE()-1
. ----
);
.
/*OR*/
INSERTINTO TABLE_EVERYDAY
SELECT  
COALESCE(A.USER_ID,B.USER_ID) AS USER_ID
,CASEWHEN A.USER_ID IS NULL THEN B.STATUS
WHENA.USER_ID IS NOT NULL AND B.USER_ID IS NULL THEN A.ACTION
WHENA.USER_ID IS NOT NULL AND B.USER_ID IS NOT NULL THEN A.ACTION
END ASSTATUS
,CURDATE()AS DATE
FROM
TABLE_TODAYA
OUTERJOIN  
TABLE_EVERYDAYB
WHEREA.USER_ID = B.USER_ID
ANDB.DATE= CURDATE()-1

5), 如何找出在一天之内始终保持feature on的人,(given status table and action table).
/*The only case is: YESTEDAY WAS ON AND THERE IS NO ACTION TODAY*/
SELECTDISTINCT A.USER_ID  
FROMTABLE_EVERYDAY A
LEFTJOIN  
TABLE_TODAYB
ONA.USER_ID = B.USER_ID
您好!
本帖隐藏的内容需要积分高于 188 才可浏览
您当前积分为 0。
使用VIP即刻解锁阅读权限或查看其他获取积分的方式
游客,您好!
本帖隐藏的内容需要积分高于 188 才可浏览
您当前积分为 0。
VIP即刻解锁阅读权限查看其他获取积分的方式
of advertiser
ad_id is idof an ad being run by advertiser
spend isamount of money in $ that advertiser pays Facebook for ad-id to show it to FBusers.
price is how much the user_id spend through this ad.,assuming all prices > 0.  
Questions- The fraction of advertisers has at least one conversion. - What metrics wouldyou show to advertisers.  

Q1: What would the averageadvertiser spend on Facebook? Your query should return a single number.
SELECTAVG(TOT_SPEND) AS AVG_ADVSR_SPNT  
FROM
(SELECTADVERTISER_ID
,SUM(SPEND)AS TOT_SPEND
FROMADV_INFO
GROUP BY1) A

追问:afterwe get this table, what will the distribution look like?
NOT SURE ABOUT THIS QUESTION

SELECT   
A.ADVERTISER_ID
,A.SPENT
,COUNT(B.*) AS IMPRESSIONS
,COUNT(DISTINCT B.USER_ID) AS UNIQUE_USER
,SUM(B.PRICE) AS SALES
,A.SPENT/NULLIFZERO(IMPRESSION)AS COST_PER_IMP
,A.SPENT/NULLIFZERO(UNIQUE_USER)AS COST_PER_USER
,A.SPENT/Z
FROM
ADV_INFOA
INNERJOIN AD_INFO B
ONA.AD_ID = B.AD_ID
GROUPBY 1,2

Q3.Thefraction of advertisers has at least one conversion.  
SELECTCOUNT(DISTINCT CASE WHEN B.PRICE>0 THEN A.ADVERTISER_ID END)/COUNT (DISTINCTA.ADVERTISER_ID) AS PCTG_CONVERSION
FROM
ADV_INFOA
INNERJOIN AD_INFO B
ONA.AD_ID =B.AD_ID


评分

参与人数 116大米 +312 收起 理由
havefunhahaha + 1 很有用的信息!
旺仔牛奶不加糖 + 1 很有用的信息!
秋风扫落叶 + 2 给你点个赞!
Camelia + 1 给你点个赞!
Jack2020u + 1 很有用的信息!

查看全部评分


上一篇:优步take home求讨论
下一篇:C3IOT DS面经

本帖被以下淘专辑推荐:

serena2005 2018-10-3 03:30:08 | 显示全部楼层
本楼:   👍  2
100%
0%
0   👎
全局:   5
100%
0%
0
2) Create atable that tracks the user last status every day.
SELECT A.DATE, B.USER_ID, B.STATUS
(SELECT GENERATE_SERIES('2018-01-01'::DATE,'2018-09-01'::DATE, '1D')::DATE) TABLEA(DATE)  
LEFT JOIN  
(SELECT * FROM
TABLE  
QUALIFY ROW_NUMBER() OVER (PARTITION BY USER_ID, DATEORDER BY TIME DESC)  =1
) B
ON TABLEA.DATE >= B.DATE. check 1point3acres for more.
QUALIFY ROW_NUMBER() OVER(PARTITION BY A.DATE, B.USER_IDORDER BY B.DATE DESC) =1
..
这部分为什么要用left join呢?
回复

使用道具 举报

bergkamp 2018-10-1 13:24:41 | 显示全部楼层
本楼:   👍  2
100%
0%
0   👎
全局:   119
95%
5%
6
请问这个为什么要Join呢? 直接Select target_id, Count(Content_id) where target_id is not NULL (或者type=comment) 不可以吗?
根据题意:好像Comment只能是对Post的,不能对Comment进行Comment. 1point 3 acres

1) 有content_id, content_type (comment/ post),target_id。如果是comment,target_id就是post的content id,如果是post则target_id为NULL。求commentdistribution。
CONTENT_ID|| CONTENT_TYPE || TARGET_ID
-baidu 1point3acres123|| COMMENT || 100.
100|| POST || NULL  

SELECTNBR_COMMENT
,COUNT(CONTENT_ID) AS NBR_POST
FROM
(SELECTA.CONTENT_ID.google  и
,COUNT(B.CONTENT_ID) AS NBR_COMMENT
FROMTABLE A
LEFTJOIN TABLE B
ONA.CONTENT_ID = B.TARGET_ID
ANDA.TARGET_ID IS NULL
ANDB.TARGET_ID IS NOT NULL
GROUPBY 1) C
GROUPBY 1.google  и
ORDERBY 1

补充内容 (2018-10-1 14:00):. 1point3acres
周末脑子不灵光了,看来得多练。 有Post没有回复
回复

使用道具 举报

huixingzhijia 2019-1-10 08:23:00 | 显示全部楼层
本楼:   👍  1
100%
0%
0   👎
全局:   794
90%
10%
86
第三问第一题,如果没看到广告值为空,是不是NULL, 没看到是不是不能算成CTR 分母里面。.--
所以 是不是把count(*) 改成count(event). 这样如果是NULL就count 为0, 要不会count 为1.
回复

使用道具 举报

 楼主| wenyuanalive 2018-10-1 01:50:52 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   173
99%
1%
1
混乱指的是题目混乱
回复

使用道具 举报

mynamesp 2018-10-1 08:26:44 | 显示全部楼层
本楼:    0
0%
0%
0  
全局:   57
100%
0%
0
谢谢楼主
回复

使用道具 举报

michaelroyean 2018-10-1 08:57:51 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   17
71%
29%
7
感谢楼主分享
回复

使用道具 举报

jzhao59 2018-10-1 09:03:33 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   6418
95%
5%
324
感谢楼主分享!!!
回复

使用道具 举报

serena2005 2018-10-1 22:08:37 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   5
100%
0%
0
1) How manyusers turned the feature on today?
是不是需要考虑一下, 如果有人先turn it on, and change it back to off?  我觉得需要 count 每个用户最后的value
回复

使用道具 举报

wangshilin422 2018-10-2 06:31:20 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   1
100%
0%
0
serena2005 发表于 2018-10-1 22:08
1) How manyusers turned the feature on today? ..
是不是需要考虑一下, 如果有人先turn it on, and chang ...

I think the variable 'time' could indicate this situation. it will show when this person turned it on and when he turned it off
回复

使用道具 举报

 楼主| wenyuanalive 2018-10-2 09:54:20 | 显示全部楼层
本楼:   👍  0
0%
0%
0   👎
全局:   173
99%
1%
1
serena2005 发表于 2018-10-1 22:08
1) How manyusers turned the feature on today? ..
是不是需要考虑一下, 如果有人先turn it on, and chang ...

取决于题目是怎么问的。如果是问最后的status,则应该考虑turn on ->off的人
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 注册账号
隐私提醒:
  • ☑ 禁止发布广告,拉群,贴个人联系方式:找人请去🔗同学同事飞友,拉群请去🔗拉群结伴,广告请去🔗跳蚤市场,和 🔗租房广告|找室友
  • ☑ 论坛内容在发帖 30 分钟内可以编辑,过后则不能删帖。为防止被骚扰甚至人肉,不要公开留微信等联系方式,如有需求请以论坛私信方式发送。
  • ☑ 干货版块可免费使用 🔗超级匿名:面经(美国面经、中国面经、数科面经、PM面经),抖包袱(美国、中国)和录取汇报、定位选校版
  • ☑ 查阅全站 🔗各种匿名方法

本版积分规则

Advertisement
>
快速回复 返回顶部 返回列表