0
00:00:00,400 --> 00:00:05,240
Interview with Bram Cohen, developer of BitTorrent

1
00:00:05,400 --> 00:00:10,240
Well, there's the universe which did happen
and the universe which should have happened,

2
00:00:10,760 --> 00:00:18,200
and pretty much everyone seems to agree 
that in the universe which should have happened,

3
00:00:21,720 --> 00:00:32,480
BitTorrent was written 
by some venture capital backed company 

4
00:00:33,880 --> 00:00:41,800
which then got sued and kinda sort of acquired 
after getting sued into oblivion

5
00:00:42,000 --> 00:00:44,600
and the CEO having a scare of almost having been

6
00:00:44,600 --> 00:00:47,080
thrown in prison, or actually being thrown in prison. 

7
00:00:48,560 --> 00:00:53,600
Of course in the universe which actually happened,

8
00:00:53,600 --> 00:00:57,040
- which doesn't really make any sense 
so it isn't really worth talking about.

9
00:00:57,040 --> 00:01:03,600
In the universe which actually happened, 
BitTorrent was written by some guy 

10
00:01:03,640 --> 00:01:09,920
in his living room who was living off of credit cards 
and became explosively huge,

11
00:01:10,040 --> 00:01:13,800
prior to taking any investment whatsoever.

12
00:01:13,800 --> 00:01:22,440
And then the person who wrote it 
somehow wound up working with Hollywood.

13
00:01:22,680 --> 00:01:30,520
And running a very legal business 
without any legally scary situations.

14
00:01:31,920 --> 00:01:39,840
When you have television or radio 
and by that I mean television over the airwaves,

15
00:01:40,200 --> 00:01:46,400
It's kinda like someone is screaming really really loudly 
and everyone else can listen.

16
00:01:46,880 --> 00:01:49,760
And so anyone who wants to can tune in

17
00:01:49,800 --> 00:01:53,760
Now this is not very efficient from an energy standpoint

18
00:01:53,800 --> 00:01:56,600
and it involves some very very loud screaming.

19
00:01:57,200 --> 00:02:02,280
It is highly effective at broadcasting signals to space aliens.  

20
00:02:02,280 --> 00:02:10,520
If there were any aliens hanging out around Earth in the 1900s,

21
00:02:10,560 --> 00:02:13,960
they'd have had a very very easy time watching television.

22
00:02:13,960 --> 00:02:16,280
And getting a good look at our culture that way, 

23
00:02:16,320 --> 00:02:19,240
because that's where most of the signal was going, 
was out into space.  

24
00:02:23,640 --> 00:02:29,440
The internet doesn't work that way - on the internet

25
00:02:29,560 --> 00:02:32,000
you have all these machines which are connected to the net

26
00:02:32,200 --> 00:02:37,360
and you can send a message to anyone else on the net,

27
00:02:37,360 --> 00:02:41,720
but it goes to just them - there's no broadcast concept there. 

28
00:02:42,040 --> 00:02:50,280
So there's this question of how do you make broadcasting work? 

29
00:02:50,800 --> 00:02:58,400
Unfortunately broadcasting can be rather difficult 
in that if you have something that's very popular

30
00:02:58,480 --> 00:03:06,000
and very large and very high quality, 
it can become very expensive to distribute it.

31
00:03:06,040 --> 00:03:11,920
It becomes expensive to be popular. 
With broadcast over the airwaves this doesn't happen,

32
00:03:12,560 --> 00:03:16,600
You're screaming loud enough, 
everybody can hear you - not a problem. 

33
00:03:17,080 --> 00:03:22,040
On the internet it's a huge problem so BitTorrent was based

34
00:03:22,040 --> 00:03:25,600
on this very fundamental calculation of 
well if you're sending out something, 

35
00:03:25,600 --> 00:03:29,720
and everyone wants the same thing, 
they can just send it to each other, 

36
00:03:30,960 --> 00:03:34,760
So there's this logistical problem of how to make that happen,

37
00:03:35,200 --> 00:03:36,720
so I figured out how to make it happen. 

38
00:03:37,320 --> 00:03:41,440
Well the basic problem is a pretty simple one,

39
00:03:41,440 --> 00:03:45,760
there's plenty of upload capacity out there 
not being used, so how do we use it?

40
00:03:45,800 --> 00:03:52,000
So that's a trivial calculation on its own; the problem is,

41
00:03:52,000 --> 00:03:54,920
these are what you call low quality resources.

42
00:03:54,920 --> 00:04:01,640
They're peers, they're untrusted, 
they are of unknown potential transfer rate

43
00:04:01,640 --> 00:04:09,680
they're not of terribly good transfer rate to begin with
and they're not very coordinated.

44
00:04:09,760 --> 00:04:13,600
This isn't really so much a problem of making something that works,

45
00:04:13,680 --> 00:04:16,800
so much as making something that works reliably,

46
00:04:16,800 --> 00:04:21,480
that can handle the fact that peers just sort of disappear  
and never come back again.

47
00:04:22,200 --> 00:04:30,000
When I started working on it there was a bunch of people 
working on very much the same thing.

48
00:04:30,000 --> 00:04:35,680
I decided on an approach that was actually 
much, much more ambitious

49
00:04:35,720 --> 00:04:39,640
than a lot of the things that had been successful 
up until that time.

50
00:04:39,680 --> 00:04:45,440
In that when you're distributing
something on the internet,

51
00:04:45,720 --> 00:04:50,000
if you're doing it via HTTP, you kinda don't want everyone

52
00:04:50,000 --> 00:04:52,320
to come and download the same thing at the same time.

53
00:04:52,320 --> 00:04:56,840
You want nice small things 
that are distributed around when people download them.

54
00:04:57,600 --> 00:05:03,000
and this is good for making it so 
you don't have too much load on the one central server,

55
00:05:03,200 --> 00:05:10,000
and I went and did this calculation and figured, 
well no, I want to do the exact opposite thing.

56
00:05:10,560 --> 00:05:13,320
I want everyone downloading the same thing at the same time.

57
00:05:13,560 --> 00:05:18,680
Because if, and at the time this was a pretty big if,
if you can get a handle

58
00:05:18,720 --> 00:05:23,360
on all the difficult logistical problems of making it actually happen,

59
00:05:23,600 --> 00:05:29,600
then you can make it so that the initial place 
only has to upload one copy

60
00:05:30,200 --> 00:05:35,240
of the whole thing and everything else 
will be distributed between peers

61
00:05:35,240 --> 00:05:41,400
and you actually get maximum efficiency 
in the very situation you were trying to avoid

62
00:05:41,440 --> 00:05:43,600
when you were doing everything via HTTP. 
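
As a rough illustration of the calculation described above, the following sketch compares how many pieces the original source has to upload when every downloader fetches the file directly from it with an idealized swarm in which the source uploads each piece exactly once and peers pass the rest along. The function names, the piece counts, and the round-robin hand-out are assumptions made for this example; it ignores peers disappearing, unknown transfer rates, and every other logistical problem mentioned in the interview, and it is not a description of the actual BitTorrent protocol.

```python
def http_origin_uploads(num_peers: int, num_pieces: int) -> int:
    """Pieces a single central server uploads when every downloader
    fetches the whole file directly from it (hypothetical model)."""
    return num_peers * num_pieces


def swarm_uploads(num_peers: int, num_pieces: int) -> dict:
    """Idealized swarm: the origin uploads each piece exactly once and
    peers pass pieces along to each other. No peer ever leaves."""
    if num_peers <= 0 or num_pieces <= 0:
        raise ValueError("num_peers and num_pieces must be positive")

    have = [set() for _ in range(num_peers)]  # pieces held by each peer
    origin_uploads = 0
    peer_uploads = 0

    # The origin hands out one copy of each piece, spread over the peers.
    for piece in range(num_pieces):
        have[piece % num_peers].add(piece)
        origin_uploads += 1

    # Every peer still missing a piece receives it from some peer that has it.
    for piece in range(num_pieces):
        for peer in range(num_peers):
            if piece not in have[peer]:
                have[peer].add(piece)
                peer_uploads += 1

    return {
        "origin_uploads": origin_uploads,
        "peer_uploads": peer_uploads,
        "all_complete": all(len(h) == num_pieces for h in have),
    }


if __name__ == "__main__":
    peers, pieces = 100, 50
    print(http_origin_uploads(peers, pieces))  # 5000
    print(swarm_uploads(peers, pieces))
    # {'origin_uploads': 50, 'peer_uploads': 4950, 'all_complete': True}
```

The total number of piece transfers is the same in both cases; what changes is who pays for them. In this model the origin's cost stays fixed at one copy no matter how many peers join.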

63
00:05:43,880 --> 00:05:51,000
So in that sense I was being rather ambitious 
although other people were working on the same problem.

64
00:05:51,000 --> 00:05:57,240
The difference was in terms of approach, 
that I came up with an architecture

65
00:05:57,240 --> 00:06:02,480
which was designed first
around reliability and efficiency

66
00:06:02,680 --> 00:06:09,920
- in fact only reliability and efficiency -
it's an utterly bizarre architecture

67
00:06:09,640 --> 00:06:12,720
unless you consider it 
from the point of view of ok...

68
00:06:12,760 --> 00:06:16,240
first thing, 
let's just assume that peers are unreliable

69
00:06:16,440 --> 00:06:18,840
that we don't know what transfer rates are

70
00:06:19,440 --> 00:06:22,280
that peers are untrusted and tend to go away

71
00:06:22,280 --> 00:06:27,000
and then, out of what's left,
how do we make something work

72
00:06:28,080 --> 00:06:34,840
other people were trying the tree-based architectures, 
which proved not to work

73
00:06:34,920 --> 00:06:39,920
but the reasons why 
are very much centered around reliability,

74
00:06:40,000 --> 00:06:43,760
and are not obvious 
unless you've done some networking and know that just...

75
00:06:44,000 --> 00:06:45,880
peers go away and never come back

76
00:06:46,120 --> 00:06:49,720
DRM has a lot of political momentum
right now

77
00:06:50,280 --> 00:06:56,800
it's just like if you're putting content up online
you have to have DRM and...

78
00:06:57,480 --> 00:07:03,120
whether this is psychological,
whether technical people are saying it,

79
00:07:03,280 --> 00:07:06,520
whether lawyers are saying it,
whether just collectively everybody feels

80
00:07:06,640 --> 00:07:13,240
that somebody must be saying it,
so you have to speak in one voice demanding it...

81
00:07:13,280 --> 00:07:16,960
... is a little unclear,
and varies from place to place

82
00:07:17,640 --> 00:07:19,400
When you go to the movie theatre you pay

83
00:07:19,560 --> 00:07:23,120
when you get a DVD you pay,
when you rent a DVD you pay

84
00:07:26,600 --> 00:07:28,000
there are a few things going on 

85
00:07:28,000 --> 00:07:31,720
one of the big things is what people associate 
with their home experience is watching television

86
00:07:31,920 --> 00:07:33,760
You turn on the television
and then you watch,

87
00:07:34,080 --> 00:07:36,160
and people are rather disinclined

88
00:07:38,200 --> 00:07:40,120
in some ways

89
00:07:40,360 --> 00:07:42,120
to pay for something to watch at home

90
00:07:42,360 --> 00:07:44,240
they want to just have ads and watch it

91
00:07:45,200 --> 00:07:47,720
by analogy with television

92
00:07:48,320 --> 00:07:50,680
another thing that happens is
that people are leery

93
00:07:50,960 --> 00:07:53,760
frequently leery of paying for anything online

94
00:07:54,040 --> 00:07:56,320
just putting in a credit card number,

95
00:07:56,640 --> 00:07:59,120
because credit cards are 
so fundamentally insecure

96
00:07:59,640 --> 00:08:01,960
makes people very nervous,
they don't want to do that,

97
00:08:02,360 --> 00:08:05,400
and it's an annoying process 
entering in your credit card number.

98
00:08:05,840 --> 00:08:09,320
Now the ridiculous insecurity of credit card numbers 
has a lot to do with

99
00:08:10,680 --> 00:08:12,960
the ridiculous way banks work
in the United States

100
00:08:13,280 --> 00:08:17,920
where things are big and bloated
and poorly done technologically

101
00:08:18,520 --> 00:08:21,280
and there's very little if any incentive to fix it.

102
00:08:22,840 --> 00:08:26,080
I would say that people are a little used to now,

103
00:08:26,320 --> 00:08:31,160
when they download videos from the net,
not paying for it,

104
00:08:31,400 --> 00:08:32,960
it's just what they've been doing,

105
00:08:33,560 --> 00:08:37,480
the paying for it model just hasn't been there!

106
00:08:38,680 --> 00:08:43,800
So people just habitually 
aren't very used to paying for things. 

107
00:08:44,120 --> 00:08:48,600
At a certain point when you get down 
to what's termed level of cost

108
00:08:49,720 --> 00:08:53,520
the actual price being charged is de minimis,

109
00:08:54,880 --> 00:08:57,520
whether that's a dollar, 
or ten cents or one cent.

111
00:08:59,960 --> 00:09:01,200
it's a little hard to say.

112
00:09:03,040 --> 00:09:07,720
But at some point the actual price paid
ceases to be a concern.

113
00:09:09,080 --> 00:09:12,280
and that, for extremely popular content,

114
00:09:13,400 --> 00:09:15,840
isn't very de minimis

115
00:09:15,800 --> 00:09:21,680
when you multiply it over 
the number of people who are paying it

116
00:09:22,440 --> 00:09:27,440
that's effectively what advertising online 
winds up being

117
00:09:27,720 --> 00:09:32,040
the monetization on it is small.

118
00:09:33,120 --> 00:09:35,200
Generally speaking a penny per impression,

119
00:09:36,600 --> 00:09:39,240
but it winds up adding up in the end.
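
As a small worked example of how a tiny per-view amount adds up, the sketch below multiplies the "penny per impression" figure quoted above by a few audience sizes. The audience sizes and the function names are assumptions chosen for illustration only; the interview gives no actual audience or revenue numbers.

```python
CENTS_PER_IMPRESSION = 1  # "a penny per impression", as quoted in the interview


def ad_revenue_cents(impressions: int,
                     cents_per_impression: int = CENTS_PER_IMPRESSION) -> int:
    """Total revenue in whole cents for a given number of impressions."""
    if impressions < 0:
        raise ValueError("impressions must be non-negative")
    return impressions * cents_per_impression


def format_dollars(cents: int) -> str:
    """Render a whole number of cents as a dollar string."""
    return f"${cents // 100:,}.{cents % 100:02d}"


if __name__ == "__main__":
    # Hypothetical audience sizes, not figures from the interview.
    for audience in (1, 1_000, 1_000_000):
        print(audience, format_dollars(ad_revenue_cents(audience)))
    # 1 $0.01
    # 1000 $10.00
    # 1000000 $10,000.00
```

Working in integer cents avoids floating-point rounding, which matters when the per-unit amount is this small.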

120
00:09:39,640 --> 00:09:43,240
The question there is 
what is the form of that monetization?

121
00:09:43,960 --> 00:09:48,080
Is it via advertising, which people are 
implicitly paying for in some way,

122
00:09:48,600 --> 00:09:55,760
Is it explicitly paying where 
there is this usability issue of making the payment,

123
00:09:57,480 --> 00:10:02,040
and the concern about fraudulent charges happening 
when this payment is happening

124
00:10:02,360 --> 00:10:05,360
concerns about incentivizing spam,
blah blah blah...

125
00:10:08,920 --> 00:10:10,600
So at a certain level 

126
00:10:10,880 --> 00:10:14,360
there's the cost of the distribution 
and there's the value gotten from the thing

127
00:10:14,640 --> 00:10:16,360
and if the cost

128
00:10:16,640 --> 00:10:20,560
is some very small fraction 
of the value gotten from the thing

129
00:10:20,960 --> 00:10:24,400
people cease to pay 
any attention to it whatsoever!

130
00:10:26,800 --> 00:10:30,720
And the question then becomes
what is the form of the payment.

131
00:10:31,480 --> 00:10:33,960
Advertising is certainly a compelling model 

132
00:10:34,280 --> 00:10:36,720
in that it's very very simple,

133
00:10:39,200 --> 00:10:42,840
Whenever you have payment 
going through a distributor

134
00:10:43,120 --> 00:10:46,080
there's the whole issue 
of making the payment happen

135
00:10:46,400 --> 00:10:50,200
both in terms of authorizing charges
and redistributing the money

136
00:10:50,520 --> 00:10:53,040
and it has to somehow 
hook into the payment system,

137
00:10:53,600 --> 00:10:57,040
and advertising is somewhat inefficient

138
00:10:57,440 --> 00:10:59,960
in that it implicitly hooks into the payment system 

139
00:11:00,280 --> 00:11:03,080
via some complicated route of 
people watching the advertising 

140
00:11:03,400 --> 00:11:05,440
doing something eventually,
somewhere out there,

141
00:11:06,040 --> 00:11:08,960
But it's much much simpler in general,

142
00:11:09,320 --> 00:11:11,000
a more straightforward way of doing things.