How dating sites work behind the scenes: how do you "click" with the right person?

Love English 2 2022-12-23

Speaker: Christian Rudder

Talk title: Inside OKCupid: The math of online dating

Hello, my name is Christian Rudder, and I was one of the founders of OkCupid. It's now one of the biggest dating sites in the United States. Like most everyone at the site, I was a math major. As you may expect, we're known for the analytic approach we take to love. We call it our matching algorithm. Basically, OkCupid's matching algorithm helps us decide whether two people should go on a date.

We built our entire business around it. Now, algorithm is a fancy word, and people like to drop it like it's this big thing. But really, an algorithm is just a systematic, step-by-step way to solve a problem. It doesn't have to be fancy at all. Here in this lesson, I'm going to explain how we arrived at our particular algorithm, so you can see how it's done.

Now, why are algorithms even important? Why does this lesson even exist? Well, notice one very significant phrase I used above: they are a step-by-step way to solve a problem, and as you probably know, computers excel at step-by-step processes. A computer without an algorithm is basically an expensive paperweight. And since computers are such a pervasive part of everyday life, algorithms are everywhere. The math behind OkCupid's matching algorithm is surprisingly simple.

It's just some addition, multiplication, a little bit of square roots. The tricky part in designing it was figuring out how to take something mysterious, human attraction, and break it into components that a computer can work with. The first thing we needed to match people up was data, something for the algorithm to work with. The best way to get data quickly from people is to just ask for it. So we decided that OkCupid should ask users questions, stuff like, "Do you want to have kids one day?" "How often do you brush your teeth?" "Do you like scary movies?" And big stuff like, "Do you believe in God?"

Now, a lot of the questions are good for matching like with like, that is, when both people answer the same way. For example, two people who are both into scary movies are probably a better match than one person who is and one who isn't. But what about a question like, "Do you like to be the center of attention?" If both people in a relationship are saying yes to this, they're going to have massive problems. We realized this early on, and so we decided we needed a bit more data from each question. We had to ask people to specify not only their own answer, but the answer they wanted from someone else. That worked really well. But we needed one more dimension.

Some questions tell you more about a person than others. For example, a question about politics, something like, "Which is worse: book burning or flag burning?" might reveal more about someone than their taste in movies. And it doesn't make sense to weigh all things equally, so we added one final data point. For everything that OkCupid asks you, you have a chance to tell us the role it plays in your life. And this ranges from irrelevant to mandatory. So now, for every question, we have three things for our algorithm: first, your answer; second, how you want someone else -- your potential match -- to answer; and third, how important the question is to you at all.
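
The three data points per question can be pictured as a small record. This is only an illustrative sketch, not OkCupid's actual data model: the names `Response` and `accepts`, and the importance labels used in the sample values, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One user's three data points for a single question (hypothetical model)."""
    answer: str            # first: the user's own answer
    acceptable: frozenset  # second: answers the user wants from a potential match
    importance: str        # third: how much the question matters, "irrelevant" to "mandatory"


def accepts(mine: Response, theirs: Response) -> bool:
    """True if the other person's answer is one that I said I wanted."""
    return theirs.answer in mine.acceptable


# "Do you like scary movies?": matching like with like works.
scary_a = Response("yes", frozenset({"yes"}), "irrelevant")
scary_b = Response("yes", frozenset({"yes"}), "irrelevant")

# "Do you like to be the center of attention?": two identical answers clash,
# because each person wants the other to say "no".
spotlight_a = Response("yes", frozenset({"no"}), "mandatory")
spotlight_b = Response("yes", frozenset({"no"}), "mandatory")

print(accepts(scary_a, scary_b), accepts(scary_b, scary_a))  # True True
print(accepts(spotlight_a, spotlight_b))                     # False
```

Storing the wanted answer separately from the user's own answer is what lets the same comparison handle both kinds of question.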

With all this information, OkCupid can figure out how well two people will get along. The algorithm crunches the numbers and gives us a result. As a practical example, let's look at how we'd match you with another person. Let's call him "B." Your match percentage with B is based on questions you've both answered. Let's call that set of common questions "s." As a very simple example, we use a small set "s" with just two questions in common, and compute a match from that. Here are our two example questions. The first one, let's say, is, "How messy are you?" And the answer possibilities are: very messy, average and very organized. And let's say you answered "very organized," and you'd like someone else to answer "very organized," and the question is very important to you.

Basically, you're a neat freak. You're neat, you want someone else to be neat, and that's it. And let's say B is a little bit different. He answered "very organized" for himself, but "average" is OK with him as an answer from someone else, and the question is only a little important to him. Let's look at the second question, from our previous example: "Do you like to be the center of attention?" The answers are "yes" and "no." You've answered "no," you want someone else to answer "no," and the question is only a little important to you. Now B, he's answered "yes." He wants someone else to answer "no," because he wants the spotlight on him, and the question is somewhat important to him. So, let's try to compute all of this.

Our first step is, since we use computers to do this, we need to assign numerical values to ideas like "somewhat important" and "very important," because computers need everything in numbers. We at OkCupid decided on the following scale: "Irrelevant" is worth 0. "A little important" is worth 1. "Somewhat important" is worth 10. "Very important" is 50. And "absolutely mandatory" is 250. Next, the algorithm makes two simple calculations. The first is: How much did B's answers satisfy you? That is, how many possible points did B score on your scale? Well, you indicated that B's answer to the first question, about messiness, was very important to you.
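
The scale just described maps directly onto a lookup table. The sketch below uses the five values from the talk; the names `IMPORTANCE_POINTS` and `points` are hypothetical, chosen only for this example.

```python
# Importance labels and their point values, as listed in the talk.
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "absolutely mandatory": 250,
}


def points(label: str) -> int:
    """Translate an importance label into its numeric weight (case-insensitive)."""
    return IMPORTANCE_POINTS[label.strip().lower()]


# Your two questions: messiness is "very important", attention is "a little important".
possible = points("Very important") + points("A little important")
print(possible)  # 51
```

The steep jumps between levels mean that one "very important" question outweighs many "a little important" ones.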

It's worth 50 points and B got that right. The second question is worth only 1, because you said it was only a little important. B got that wrong, so B's answers were 50 out of 51 possible points. That's 98% satisfactory. Pretty good. The second question the algorithm looks at is: How much did you satisfy B? Well, B placed 1 point on your answer to the messiness question and 10 on your answer to the second. Of those 11, that's 1 plus 10, you earned 10 -- you guys satisfied each other on the second question. So your answers were 10 out of 11 equals 91 percent satisfactory to B. That's not bad. The final step is to take these two match percentages and get one number for the both of you.
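
The two calculations can be reproduced in a few lines. This is a sketch under stated assumptions, not OkCupid's implementation: the function name `satisfaction` and the dictionary layout are invented for the example, and B's accepted answer on messiness is modeled as "average" only, since that is the reading consistent with your earning 10 of 11 points. Note that on the second question only you satisfied B; B's "yes" did not match the "no" you wanted.

```python
def satisfaction(my_prefs, their_answers):
    """Fraction of my possible points that the other person's answers earned.

    my_prefs maps question -> (set of answers I accept, importance points).
    their_answers maps question -> the other person's own answer.
    """
    possible = sum(weight for _accepted, weight in my_prefs.values())
    if possible == 0:
        # The talk does not cover this case; raising is an assumption of this sketch.
        raise ValueError("no question carries any weight")
    earned = sum(
        weight
        for question, (accepted, weight) in my_prefs.items()
        if their_answers[question] in accepted
    )
    return earned / possible


you_prefs = {"messy": ({"very organized"}, 50), "attention": ({"no"}, 1)}
b_prefs = {"messy": ({"average"}, 1), "attention": ({"no"}, 10)}
you_answers = {"messy": "very organized", "attention": "no"}
b_answers = {"messy": "very organized", "attention": "yes"}

print(f"{satisfaction(you_prefs, b_answers):.0%}")  # 98%  (50 of 51)
print(f"{satisfaction(b_prefs, you_answers):.0%}")  # 91%  (10 of 11)
```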

To do this, the algorithm multiplies your scores, then takes the nth root, where "n" is the number of questions. Because s, which is the number of questions in this sample, is only 2, we have: match percentage equals the square root of 98 percent times 91 percent. That equals 94 percent. That 94 percent is your match percentage with B.
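
For the two-question example, the final step is a one-liner. The sketch below covers only the square-root case worked through in the talk; the function name `match_percentage` is hypothetical.

```python
import math


def match_percentage(a_satisfied_by_b: float, b_satisfied_by_a: float) -> float:
    """Geometric mean of the two satisfaction scores (each between 0 and 1)."""
    return math.sqrt(a_satisfied_by_b * b_satisfied_by_a)


match = match_percentage(50 / 51, 10 / 11)
print(f"{match:.0%}")  # 94%
```

Using the exact fractions 50/51 and 10/11 or the rounded 98 and 91 percent gives the same result after rounding to a whole percent.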

It's a mathematical expression of how happy you'd be with each other, based on what we know. Now, why does the algorithm multiply, as opposed to, say, average the two match scores together, and do the square-root business? In general, this formula is called the geometric mean. It's a great way to combine values that have wide ranges and represent very different properties. In other words, it's perfect for romantic matching.

You've got wide ranges and you've got tons of different data points, like I said, about movies, politics, religion -- everything. Intuitively, too, this makes sense. Two people satisfying each other 50 percent should be a better match than two others who satisfy 0 and 100, because affection needs to be mutual. After adding a little correction for margin of error, in the case where we have a small number of questions, like we do in this example, we're good to go. Any time OkCupid matches two people, it goes through the steps we just outlined.
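
The 50/50 versus 0/100 comparison is easy to check numerically. This small sketch contrasts a plain average with the geometric mean; the function names are chosen for the example, and the margin-of-error correction mentioned above is left out because the talk does not spell it out.

```python
import math


def arithmetic_mean(a: float, b: float) -> float:
    return (a + b) / 2


def geometric_mean(a: float, b: float) -> float:
    return math.sqrt(a * b)


mutual = (0.5, 0.5)    # each person is 50 percent satisfied
one_sided = (0.0, 1.0)  # one person is fully satisfied, the other not at all

# A plain average cannot tell the two couples apart.
print(arithmetic_mean(*mutual), arithmetic_mean(*one_sided))  # 0.5 0.5

# The geometric mean rewards mutual satisfaction and zeroes out the one-sided pair.
print(geometric_mean(*mutual), geometric_mean(*one_sided))    # 0.5 0.0
```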

First it collects data about your answers, then it compares your choices and preferences to other people's in simple, mathematical ways. This, the ability to take real-world phenomena and make them something a microchip can understand, is, I think, the most important skill anyone can have these days. Like you use sentences to tell a story to a person, you use algorithms to tell a story to a computer. If you learn the language, you can go out and tell your stories. I hope this will help you do that.

Source: TED Talk
