White House Energy Datapalooza: Data Market


By DemAdmin - Posted on 16 October 2012

Data Market helps business users find and understand data, and helps data providers efficiently publish their data and reach new audiences.

datamarket.com

Transcript with time markers:
1
00:00:00,200 --> 00:00:04,100
Speaker:
So open data, a lot of fantastic
data has been opened up by the

2
00:00:04,100 --> 00:00:06,734
public sector all over the
world over the course of the

3
00:00:06,734 --> 00:00:08,967
last few years.

4
00:00:08,967 --> 00:00:13,834
And when my slides come up, you
will actually see a blue circle

5
00:00:13,834 --> 00:00:16,533
that represents all of the
great open data that has been

6
00:00:16,533 --> 00:00:17,533
opened up.

7
00:00:17,533 --> 00:00:19,567
There's actually more of
it out than most of us --

8
00:00:19,567 --> 00:00:20,934
most of us realize.

9
00:00:20,934 --> 00:00:24,900
And what we've seen so far in
terms of using that data is that

10
00:00:24,900 --> 00:00:28,834
we've seen entrepreneurs, clever
developers and designers take

11
00:00:28,834 --> 00:00:32,967
small parts of that data
and make apps out of it.

12
00:00:32,967 --> 00:00:36,600
You know, and making that data
super useful, super engaging,

13
00:00:36,600 --> 00:00:38,033
super easy to use.

14
00:00:38,033 --> 00:00:40,700
And there are actually quite a
lot of these applications out

15
00:00:40,700 --> 00:00:44,166
there, each taking their own
little pocket of the data that

16
00:00:44,166 --> 00:00:45,400
has been opened up.

17
00:00:45,400 --> 00:00:47,567
So here we have the slides.

18
00:00:47,567 --> 00:00:50,867
Here is the world of open data,
here's the data taken by a

19
00:00:50,867 --> 00:00:54,065
single app and made
useful to us, you know,

20
00:00:54,066 --> 00:00:56,166
in an individual way,
and that's fantastic.

21
00:00:56,166 --> 00:00:58,199
And there are quite a lot
of those apps out there,

22
00:00:58,200 --> 00:01:00,300
each taking their own
little pocket of open --

23
00:01:00,300 --> 00:01:03,867
of the world of open data and
making it easy for us to use,

24
00:01:03,867 --> 00:01:06,899
super useful, engaging,
and even fun to use,

25
00:01:06,900 --> 00:01:08,100
as we've seen this morning.

26
00:01:08,100 --> 00:01:09,533
And that's great.

27
00:01:09,533 --> 00:01:12,066
But -- and the fact of the
matter is the bulk of this data

28
00:01:12,066 --> 00:01:15,934
sits there, known to few
and used by even fewer,

29
00:01:15,934 --> 00:01:19,333
it just lies there waiting
for -- waiting to be --

30
00:01:19,333 --> 00:01:21,533
waiting to be used
and turned into value.

31
00:01:21,533 --> 00:01:25,233
Let's take the world of
energy data as an example.

32
00:01:25,233 --> 00:01:29,266
So when Taut [phonetic] and his
group kicked this off back in

33
00:01:29,266 --> 00:01:32,500
May, they pointed us to a
lot of different open energy

34
00:01:32,500 --> 00:01:34,066
data sources.

35
00:01:34,066 --> 00:01:36,333
We have the EPA's website,
they have their own system,

36
00:01:36,333 --> 00:01:37,967
a lot of interesting data there.

37
00:01:37,967 --> 00:01:41,000
The EIA has several different
systems on their website,

38
00:01:41,000 --> 00:01:43,667
the data is in different
places and so on.

39
00:01:43,667 --> 00:01:45,800
We have Department of Energy.

40
00:01:45,800 --> 00:01:49,433
We have the National
Renewable Energy Laboratory.

41
00:01:49,433 --> 00:01:51,533
We have climate change data.

42
00:01:51,533 --> 00:01:53,800
We have more from the
DOE, a different system.

43
00:01:53,800 --> 00:01:56,166
We have IPCC data, and
so on and so forth.

44
00:01:56,166 --> 00:01:59,433
And these are just a few of
the sources that are out there.

45
00:01:59,433 --> 00:02:02,000
And not only are they
in different places,

46
00:02:02,000 --> 00:02:03,367
they're also in
different formats.

47
00:02:03,367 --> 00:02:06,500
We'll have an Excel sheet here,
a PowerPoint there, a PDF there,

48
00:02:06,500 --> 00:02:10,667
there's SPSS files, CSV files,
and a different XML format,

49
00:02:10,667 --> 00:02:12,033
and so on and so forth.

50
00:02:12,033 --> 00:02:15,000
Let alone all the proprietary
systems that have their own data

51
00:02:15,000 --> 00:02:17,200
models and their
own APIs and so on.

52
00:02:17,200 --> 00:02:20,200
So not only is the data
heterogeneous in location,

53
00:02:20,200 --> 00:02:22,399
it's also heterogeneous
in format.

54
00:02:22,400 --> 00:02:25,033
And this is actually quite
familiar to us at Data Market.

55
00:02:25,033 --> 00:02:28,566
What we do, what our technology
does is it takes heterogeneous

56
00:02:28,567 --> 00:02:31,700
data from heterogeneous
data sources, reads it in,

57
00:02:31,700 --> 00:02:35,333
normalizes it, and stores it
in our systems in a single data

58
00:02:35,333 --> 00:02:39,367
model, making it all super
easy to find and understand.

59
00:02:39,367 --> 00:02:42,899
So what we've done here is that
we've taken data from these

60
00:02:42,900 --> 00:02:45,433
great providers of
energy, energy data,

61
00:02:45,433 --> 00:02:48,400
we've taken all the quantitative
data that we could find,

62
00:02:48,400 --> 00:02:50,767
time series data,
survey data and so on,

63
00:02:50,767 --> 00:02:53,633
and made it all available
through a single portal.

64
00:02:53,633 --> 00:02:56,567
And in addition to some
of the U.S. sources,

65
00:02:56,567 --> 00:02:59,400
we've also added some of the
international open sources that

66
00:02:59,400 --> 00:03:00,600
are out there.

67
00:03:00,600 --> 00:03:03,466
And the best way -- so actually
this is what I'm here to

68
00:03:03,467 --> 00:03:05,066
introduce today.

69
00:03:05,066 --> 00:03:08,333
Data Market Energy, your
portal to the world of energy.

70
00:03:08,333 --> 00:03:10,867
And the best way to understand
what it's all about is actually

71
00:03:10,867 --> 00:03:12,132
to see a little demo.

72
00:03:12,133 --> 00:03:13,300
So we have a video of that.

73
00:03:35,867 --> 00:03:37,299
Here we go.

74
00:03:37,300 --> 00:03:40,066
So, yeah, the front page gives
you a little bit of an overview

75
00:03:40,066 --> 00:03:41,967
of what the data --
the data that is there.

76
00:03:41,967 --> 00:03:44,600
But the best way to find the
data is actually just to type in

77
00:03:44,600 --> 00:03:46,166
a query as you would in Google.

78
00:03:46,166 --> 00:03:48,033
It searches through
all this data.

79
00:03:48,033 --> 00:03:50,399
So for renewable capacity,
we have nine matches.

80
00:03:50,400 --> 00:03:52,633
The matches look a little bit
like Google search results,

81
00:03:52,633 --> 00:03:54,767
we have a title, we
see the data provider,

82
00:03:54,767 --> 00:03:57,367
and we see the time span
covered by this data.

83
00:03:57,367 --> 00:03:59,767
But what happens if you
click one of the results,

84
00:03:59,767 --> 00:04:02,200
it's very different from
what happens in Google.

85
00:04:02,200 --> 00:04:04,466
Instead of taking you off
to a third-party website,

86
00:04:04,467 --> 00:04:06,066
clicking this data
set, for example,

87
00:04:06,066 --> 00:04:08,433
it will immediately show
you a view of the data,

88
00:04:08,433 --> 00:04:11,299
showing how renewable energy
capacity has developed

89
00:04:11,300 --> 00:04:12,400
over time.

90
00:04:12,400 --> 00:04:15,900
And on the left-hand side, we
have the rest of the data that's

91
00:04:15,900 --> 00:04:17,500
available in this
single data set.

92
00:04:17,500 --> 00:04:21,100
So here we're breaking down the
renewable data sources to see

93
00:04:21,100 --> 00:04:22,467
the composition.

94
00:04:22,467 --> 00:04:26,332
And when we hit visualize,
we see that it's wind that's

95
00:04:26,333 --> 00:04:28,066
actually kicking in
in the last few --

96
00:04:28,066 --> 00:04:29,633
last decade or so there.

97
00:04:29,633 --> 00:04:32,299
We can change the data view, so
we can stack them up to see what

98
00:04:32,300 --> 00:04:33,700
they look like when
they're stacked up,

99
00:04:33,700 --> 00:04:35,866
or we can switch to a
bar chart, for example,

100
00:04:35,867 --> 00:04:38,433
to show the composition
in a single year.

101
00:04:38,433 --> 00:04:41,599
The time slider there underneath
the graph enables you to pick

102
00:04:41,600 --> 00:04:43,800
any time span or time
point that you want.

103
00:04:43,800 --> 00:04:48,000
And we see that back in 1987,
conventional hydroelectric power

104
00:04:48,000 --> 00:04:50,633
was pretty much the only
renewable data source out there.

105
00:04:50,633 --> 00:04:52,133
So a lot has happened since.

106
00:04:52,133 --> 00:04:53,467
So back to the slides, please.

107
00:04:56,400 --> 00:04:58,233
So what else might we find here?

108
00:04:58,233 --> 00:05:01,867
We might find gasoline
prices in Massachusetts,

109
00:05:01,867 --> 00:05:05,900
we might find primary -- so
basically energy production by

110
00:05:05,900 --> 00:05:10,166
source, blue is fossil fuels,
yellow is renewable energy,

111
00:05:10,166 --> 00:05:11,567
still some work to do there.

112
00:05:11,567 --> 00:05:15,333
This data and the data set
before it comes from the EIA.

113
00:05:15,333 --> 00:05:17,700
From Oak Ridge
National Laboratory,

114
00:05:17,700 --> 00:05:20,133
we have car registrations
from all over the world.

115
00:05:20,133 --> 00:05:22,000
And there are a few
other things in there.

116
00:05:22,000 --> 00:05:25,000
All in all, we already have
about 2 million time series

117
00:05:25,000 --> 00:05:28,633
about energy data -- of energy
data available in Data Market,

118
00:05:28,633 --> 00:05:31,799
coming from about 10,000
data sources across 13 data

119
00:05:31,800 --> 00:05:35,367
providers, and they hold about
50 million facts about the world

120
00:05:35,367 --> 00:05:38,633
of energy, all ready
to search, visualize,

121
00:05:38,633 --> 00:05:42,066
compare and download in your
favorite data format, mind you,

122
00:05:42,066 --> 00:05:44,500
from Data Market Energy.

123
00:05:44,500 --> 00:05:48,100
Regardless of the origin,
whether it was an Excel sheet

124
00:05:48,100 --> 00:05:51,834
from the EPA's website or
data from the new and great

125
00:05:51,834 --> 00:05:56,367
electricity price API that the
EIA just launched whose data we

126
00:05:56,367 --> 00:05:58,066
mostly have in our
system already,

127
00:05:58,066 --> 00:05:59,933
it was launched a week ago.

128
00:05:59,934 --> 00:06:05,200
And this is actually new
business for us enabled by open

129
00:06:05,200 --> 00:06:08,533
data from organizations such as
these that we have seen here.

130
00:06:08,533 --> 00:06:12,233
It's new business for us in
taking the data that's already

131
00:06:12,233 --> 00:06:14,533
there, we're not taking anything
away from the value that has

132
00:06:14,533 --> 00:06:17,266
already been released
by the data --

133
00:06:17,266 --> 00:06:19,166
by putting the data out there.

134
00:06:19,166 --> 00:06:20,166
Quite the contrary.

135
00:06:20,166 --> 00:06:23,133
We're adding value on top of
it and making that available to

136
00:06:23,133 --> 00:06:25,366
those that are willing
to pay for the increased

137
00:06:25,367 --> 00:06:28,967
discoverability, productivity
and usefulness of the data once

138
00:06:28,967 --> 00:06:30,099
it's in the system.

139
00:06:30,100 --> 00:06:33,433
So again, new business
enabled by open data.

140
00:06:33,433 --> 00:06:37,467
We're launching this later --
commercially later this month.

141
00:06:37,467 --> 00:06:39,032
Data Market Energy
will be available on

142
00:06:39,033 --> 00:06:42,266
Energy.DataMarket.com, and you
can actually go there today to

143
00:06:42,266 --> 00:06:45,133
read more about the service
and sign up for a free trial as

144
00:06:45,133 --> 00:06:46,066
we go live.

145
00:06:46,066 --> 00:06:47,467
Thank you very much.

146
00:06:47,467 --> 00:06:49,800
(applause)