May 2009 - May 2017 论坛八周年-你的足迹,我的骄傲



查看: 2828|回复: 8


[复制链接] |试试Instant~ |关注本帖
chenwoo 发表于 2015-4-16 11:29:51 | 显示全部楼层 |阅读模式

2015(4-6月) 码农类 硕士 全职@LiveRamp - 内推 - 技术电面 |Failfresh grad应届毕业生


您需要 登录 才可以下载或查看,没有帐号?获取更多干活,快来注册


第一个是what‘s your most Challenging project?这个题目我答得不好,估计跪在这里了

第三道题目:given 1 terabyte key-value data, how do you store them so that insert, find, delete operations can be very fast?
这个问题我的回答是把这个大的file分成10,000份,then to the remainder that hashcode of each element modulo 10.000, we decide which element is in which small file。
如果有十台机器的话,先把这个big file 划分到10 台机器上,然后再载每台机器上划分成10,000份, 每一份再hash。


下面是six degree的题目答案:

We can treat the problem as a graph. Each person represents a node. and if two people are friends, there is a edge between the two nodes that represent them. The problem becomes that find the shortest path between two node: start node and end node.

We can adopt BFS, Dijkstra algorithm,  Bidirectional BFS.
|E| is the number of edges, and |V| is the number of vertex.
1   BFS runs in time O(|V| + |E|)
2   Dijkstra Algorithm with a priority queue runs in time O(|E| + |V|log|V| )
Since the weight of each edge in the search grap is same, Dijkstra algorithm  will degenerate into BFS. Besides, it needs more data structre than BFS to implement it. For example, Dijkstra algorithm needs priority queue to keep track of every vertex and distance array to keep track of the distance from source vertex to other vertex.
3   Bidirectional BFS is better than BFS.
I will talk about why Bidirectional BFS is better than BFS below.
assume the distance between source to target is k, and the branch factor is B [every vertex has B edges].
BFS will open: 1 + B + B^2 + ... + B^k vertices.
bi-directional BFS will open: 2 + 2B + 2B^2 + 2B^3 + .. + 2B^(k/2) vertices.
for large B and k, the second is obviously much better the the first.

I will choose bidirectional BFS.

Algorithm idea: do a BFS search simultaneously from the source and the target level by level. (level 0 from source and target respectively , level 1 from source and target respectively....)    The algorithm will end when the level from source meets the level from the target.

We will use the following data structures:
1   two queues to do BSF respectively from source and target node
2   we need class Node to represent every node in the graph.
Class Node {
        public String name;
        public ArrayList<Node> neighbors; //
        public ArrayList<Node> predecessors; // to keep track of predecessors of every node in order to construct the shortest path after Bi-                         BFS
        public boolean visited; // label the visited node
        public boolean visitedFromStart; //label the node that are visited from start node

Below is my java code to implement the bidirectional BFS. After we run the bidirectional BFS., we can construct the shortestt path by use of  predecessors of every node form start node to end node.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Queue;
-google 1point3acres

public class Solution {
        public class Node {
                public String name;
                public ArrayList<Node> neighbors;
                public ArrayList<Node> predecessors;
                public boolean visited;
                public boolean visitedFromStart; //label node that are visited from start node

public        void Bi_BFS(Node start, Node end) {
                Queue<Node> Q1 = new LinkedList<Node>();
                Queue<Node> Q2 = new LinkedList<Node>();
                start.visited = true;
                start.visitedFromStart = true;
                end.visited = true;
                while (!Q1.isEmpty() && !Q2.isEmpty()) {
                        int LevelSize1 = Q1.size();
                        for (int i = 0; i < LevelSize1; i++) {
                                Node front1 = Q1.poll();
                                for (Node next : front1.neighbors) {
                                        if (next.visited == false) {
                                                next.visited = true;
                                                next.visitedFromStart = true;
                        int LevelSize2 = Q2.size();
                        for (int i = 0; i < LevelSize2; i++) {
                                Node front2 = Q2.poll();
                                if (front2.visitedFromStart ) {
                                        //we find the shortest path
                                for (Node next : front2.neighbors) {
                                        if(next.visited == false) {
                                                next.visited = true;

. visit for more.

下面是max stack from scratch 鏉ユ簮涓浜.涓夊垎鍦拌鍧.
. From 1point 3acres bbs
  • class MaxStack {
  •     private Stack<Integer> stk = new Stack<Integer>();
  •     private Stack<Integer> maxStack = new Stack<Integer>();
  •     public void push(int x) {
  •         if (maxStack.empty() == true || x >= maxStack.peek()  ) {
  •             maxStack.push(x);
  •         } //if the input element is greater than or equal to maxStack.peek, then                          //        put it into maxStack
  •         stk.push(x);
  •     }
  •     public void pop() {
  •         int result = stk.pop();
  •         if ( result == maxStack.peek() ) {
  •             maxStack.pop();
  •         }
  •     }
  •     public int top() {
  •         return stk.peek();
  •     }
  •     public int getMax() {
  •         return maxStack.peek();
  •     }
  • }

. visit for more.

下面是 find the kth largest element in an array 的答案

    • We can find kth largest element in time complexity better than O(nLogn). A simple optomization is to create a Max Heap of the given n elements and call extractMin() k times.         
      Time complexity of this solution is O(n + kLogn) . when k is much less than n, time complexity is O(n), when k is close to n, it is O(n)
    • we can use a min heap to find kth largest element.
      step 1: Build a Min-Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array         -google 1point3acres
      step 2: then we iterate through the given array from arr[k] to arr[N-1]; during each iteration, we compare the current element with the root of the Min-Heap

      • if element is greater than or equal to the root, use it to replace the root and call the heapify function for the Min-Heap
      • if element is less than the root, we ignore it.

step 3 the root is the k largest element.
time complexity is O(k + (n-k) * logk );
  • Quickselect. 鍥磋鎴戜滑@1point 3 acres

  • Median-of-medians

一个文件找出unique words 的数目?如果是1GB?   hashset    1TB? 分布式系统design

. Waral 鍗氬鏈夋洿澶氭枃绔,
思路2: 根据STRING产生HASHCODE,再和总机器数量去余数,根据余数可以把一样的STRING都分配到一个小机器上,每台机器分组HASHSET 先把小集合里面重复的都去掉 汇总应该就可以了
assume we have n computers, then we scan the big file, and compute the hashcode of each string, then  hashcode modulo n (the number of computers) , we get a remainder. according to the remainder, we assign the string to respective computer. after scan the whole file,  each mechine will use a hashtable to count the number of unique string on that mechine. Then we add up the number from all mechines. and we get the total number of unique string.
If there are a lot of strings on one mechine, we can divide the strings into M chunks, for each chunk, we use a hashtable to find out the unique strings and put them into a file.  then merge these small files into a big file.  Then we do the same operation on the big file recursively until there are no duplicates and we count the number..1point3acres缃
思路3: 1. 采样1% string,数据大的话实际上0.1%也成,排序。2.把这个排序的文件分成n份,记录下分点上的string。3.处理源数据,分成m块,每块uniq,然后对每个结果string,在2的结果上二分查找决定分到哪份,写下去

step 1 we take some sample strings from the big file. for example ,one percent or  zero point one percent (0.1 %) if the file is too big. Then we sort the sample strings.
step 2 we divide the sample strings into N intervals
step 3 we divdie the big file into M chunks, for each chunk we use a hashset to find out the unique strings. then for each unique string we use binary search to find its position in the smaple strings, put it into relitive interval.
step 4 we deal with the N intervals respectively.  for each intervals we use a hashmap to  find out the unique strings . Then  we add up the number of unique strings in each interval  and we get the total number of unique string.


cards on the table, X, Y, 1, 2. . WEach card has letter on one side and number on the other side. . Given the statement "The number side of X is even", at least how many cards need to flip to prove/disprove it?
-google 1point3acres
answer: we need to filp two cards, one is X card, the ohter is 1 card. if the other side of X card is odd, the statement is false. OR if the other side of 1 card is X, the statement is false. Otherwise, the statement is true;


. Waral 鍗氬鏈夋洿澶氭枃绔,
according to the law of large numbers (LLN), the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
if your shooting probility is greater than 0.6 , I will choose ⅝.  if you are not gooding at playing basketball, I will choose ⅔, and you got that result  by chance.



. 鐗涗汉浜戦泦,涓浜╀笁鍒嗗湴




stevenlordiam 发表于 2015-4-16 11:45:15 | 显示全部楼层
看来不同人面的题不一样 我今天是Tenzi面的 还有新题==
回复 支持 反对

使用道具 举报

 楼主| chenwoo 发表于 2015-4-16 11:50:53 | 显示全部楼层
stevenlordiam 发表于 2015-4-16 11:45
看来不同人面的题不一样 我今天是Tenzi面的 还有新题==

嗨,我跪在旧题上了,太遗憾了,祝福你move on and get the offer
回复 支持 反对

使用道具 举报

stevenlordiam 发表于 2015-4-16 12:19:08 | 显示全部楼层
chenwoo 发表于 2015-4-16 11:50
嗨,我跪在旧题上了,太遗憾了,祝福你move on and get the offer
-google 1point3acres
我也没回复。。。看了几个今天的都没回复 看来真的是在练习面试官
回复 支持 反对

使用道具 举报

 楼主| chenwoo 发表于 2015-4-16 21:56:32 | 显示全部楼层
stevenlordiam 发表于 2015-4-16 12:19
我也没回复。。。看了几个今天的都没回复 看来真的是在练习面试官
. 鍥磋鎴戜滑@1point 3 acres
回复 支持 反对

使用道具 举报

 楼主| chenwoo 发表于 2015-4-16 21:56:38 | 显示全部楼层
stevenlordiam 发表于 2015-4-16 12:19
我也没回复。。。看了几个今天的都没回复 看来真的是在练习面试官

回复 支持 反对

使用道具 举报

chenwenzhejob 发表于 2015-4-17 14:15:00 | 显示全部楼层
楼主不要灰心,我第二面被拒。 其实不管你答得多好都是没用的。 一开始我不相信,后来我给出很好的答案还悲剧就相信了。 加油!
回复 支持 反对

使用道具 举报

 楼主| chenwoo 发表于 2015-4-17 21:57:58 | 显示全部楼层
chenwenzhejob 发表于 2015-4-17 14:15
楼主不要灰心,我第二面被拒。 其实不管你答得多好都是没用的。 一开始我不相信,后来我给出很好的答案还悲 ...

Thx, 同加油啦
回复 支持 反对

使用道具 举报

面假空虚 发表于 2015-10-22 17:14:54 | 显示全部楼层
Dijkstra的复杂度写错了吧?是O(|V|+|E|*lg|V|), 前面是初始化所有点,后面是最多做总边树次min heap的reduce key操作。
回复 支持 反对

使用道具 举报



一亩三分地推荐上一条 /5 下一条

手机版|小黑屋|一亩三分地论坛声明 ( 沪ICP备11015994号 )

custom counter

GMT+8, 2017-5-26 17:26

Powered by Discuz! X3

© 2001-2013 Comsenz Inc. Design By HUXTeam

快速回复 返回顶部 返回列表