White House Energy Datapalooza: Data Market
Data Market helps business users find and understand data, and helps data providers efficiently publish their data and reach new audiences.
Transcript with time markers:
1
00:00:00,200 --> 00:00:04,100
Speaker:
So open data, a lot of fantastic
data has been opened up by the
2
00:00:04,100 --> 00:00:06,734
public sector all over the
world over the course of the
3
00:00:06,734 --> 00:00:08,967
last few years.
4
00:00:08,967 --> 00:00:13,834
And when my slides come up, you
will actually see a blue circle
5
00:00:13,834 --> 00:00:16,533
that represents all of the
great open data that has been
6
00:00:16,533 --> 00:00:17,533
opened up.
7
00:00:17,533 --> 00:00:19,567
There's actually more of
it out than most of us --
8
00:00:19,567 --> 00:00:20,934
most of us realize.
9
00:00:20,934 --> 00:00:24,900
And what we've seen so far in
terms of using that data is that
10
00:00:24,900 --> 00:00:28,834
we've seen entrepreneurs, clever
developers and designers take
11
00:00:28,834 --> 00:00:32,967
small parts of that data
and make apps out of it.
12
00:00:32,967 --> 00:00:36,600
You know, and making that data
super useful, super engaging,
13
00:00:36,600 --> 00:00:38,033
super easy to use.
14
00:00:38,033 --> 00:00:40,700
And there are actually quite a
lot of these applications out
15
00:00:40,700 --> 00:00:44,166
there, each taking their own
little pocket of the data that
16
00:00:44,166 --> 00:00:45,400
has been opened up.
17
00:00:45,400 --> 00:00:47,567
So here we have the slides.
18
00:00:47,567 --> 00:00:50,867
Here is the world of open data,
here's the data taken by a
19
00:00:50,867 --> 00:00:54,065
single app and made
useful to us, you know,
20
00:00:54,066 --> 00:00:56,166
in an individual way,
and that's fantastic.
21
00:00:56,166 --> 00:00:58,199
And there are quite a lot
of those apps out there,
22
00:00:58,200 --> 00:01:00,300
each taking their own
little pocket of open --
23
00:01:00,300 --> 00:01:03,867
of the world of open data and
making it easy for us to use,
24
00:01:03,867 --> 00:01:06,899
super useful, engaging,
and even fun to use,
25
00:01:06,900 --> 00:01:08,100
as we've seen this morning.
26
00:01:08,100 --> 00:01:09,533
And that's great.
27
00:01:09,533 --> 00:01:12,066
But -- and the fact of the
matter is the bulk of this data
28
00:01:12,066 --> 00:01:15,934
sits there, known to few
and used by even fewer,
29
00:01:15,934 --> 00:01:19,333
it just lies there waiting
for -- waiting to be --
30
00:01:19,333 --> 00:01:21,533
waiting to be used
and turned into value.
31
00:01:21,533 --> 00:01:25,233
Let's take the world of
energy data as an example.
32
00:01:25,233 --> 00:01:29,266
So when Taut [phonetic] and his
group kicked this off back in
33
00:01:29,266 --> 00:01:32,500
May, they pointed us to a
lot of different open energy
34
00:01:32,500 --> 00:01:34,066
data sources.
35
00:01:34,066 --> 00:01:36,333
We have the EPA's website,
they have their own system,
36
00:01:36,333 --> 00:01:37,967
a lot of interesting data there.
37
00:01:37,967 --> 00:01:41,000
The EIA has several different
systems on their website,
38
00:01:41,000 --> 00:01:43,667
the data is in different
places and so on.
39
00:01:43,667 --> 00:01:45,800
We have Department of Energy.
40
00:01:45,800 --> 00:01:49,433
We have the National
Renewable Energy Laboratory.
41
00:01:49,433 --> 00:01:51,533
We have climate change data.
42
00:01:51,533 --> 00:01:53,800
We have more from the
DOE, a different system.
43
00:01:53,800 --> 00:01:56,166
We have IPCC data, and
so on and so forth.
44
00:01:56,166 --> 00:01:59,433
And these are just a few of
the sources that are out there.
45
00:01:59,433 --> 00:02:02,000
And not only are they
in different places,
46
00:02:02,000 --> 00:02:03,367
they're also in
different formats.
47
00:02:03,367 --> 00:02:06,500
We'll have an Excel sheet here,
a PowerPoint there, a PDF there,
48
00:02:06,500 --> 00:02:10,667
there's SPSS files, CSV files,
and a different XML format,
49
00:02:10,667 --> 00:02:12,033
and so on and so forth.
50
00:02:12,033 --> 00:02:15,000
Let alone all the proprietary
systems that have their own data
51
00:02:15,000 --> 00:02:17,200
models and their
own APIs and so on.
52
00:02:17,200 --> 00:02:20,200
So not only is the data
heterogeneous in location,
53
00:02:20,200 --> 00:02:22,399
it's also heterogeneous
in format.
54
00:02:22,400 --> 00:02:25,033
And this is actually quite
familiar to us at Data Market.
55
00:02:25,033 --> 00:02:28,566
What we do, what our technology
does is it takes heterogeneous
56
00:02:28,567 --> 00:02:31,700
data from heterogeneous
data sources, reads it in,
57
00:02:31,700 --> 00:02:35,333
normalizes it, and stores it
in our systems in a single data
58
00:02:35,333 --> 00:02:39,367
model, making it all super
easy to find and understand.
59
00:02:39,367 --> 00:02:42,899
So what we've done here is that
we've taken data from these
60
00:02:42,900 --> 00:02:45,433
great providers of
energy, energy data,
61
00:02:45,433 --> 00:02:48,400
we've taken all the quantitative
data that we could find,
62
00:02:48,400 --> 00:02:50,767
time series data,
survey data and so on,
63
00:02:50,767 --> 00:02:53,633
and made it all available
through a single portal.
64
00:02:53,633 --> 00:02:56,567
And in addition to some
of the U.S. sources,
65
00:02:56,567 --> 00:02:59,400
we've also added some of the
international open sources that
66
00:02:59,400 --> 00:03:00,600
are out there.
67
00:03:00,600 --> 00:03:03,466
And the best way -- so actually
this is what I'm here to
68
00:03:03,467 --> 00:03:05,066
introduce today.
69
00:03:05,066 --> 00:03:08,333
Data Market Energy, your
portal to the world of energy.
70
00:03:08,333 --> 00:03:10,867
And the best way to understand
what it's all about is actually
71
00:03:10,867 --> 00:03:12,132
to see a little demo.
72
00:03:12,133 --> 00:03:13,300
So we have a video of that.
73
00:03:35,867 --> 00:03:37,299
Here we go.
74
00:03:37,300 --> 00:03:40,066
So, yeah, the front page gives
you a little bit of an overview
75
00:03:40,066 --> 00:03:41,967
of what the data --
the data that is there.
76
00:03:41,967 --> 00:03:44,600
But the best way to find the
data is actually just to type in
77
00:03:44,600 --> 00:03:46,166
a query as you would in Google.
78
00:03:46,166 --> 00:03:48,033
It searches through
all this data.
79
00:03:48,033 --> 00:03:50,399
So for renewable capacity,
we have nine matches.
80
00:03:50,400 --> 00:03:52,633
The matches look a little bit
like Google search results,
81
00:03:52,633 --> 00:03:54,767
we have a title, we
see the data provider,
82
00:03:54,767 --> 00:03:57,367
and we see the time span
covered by this data.
83
00:03:57,367 --> 00:03:59,767
But what happens if you
click one of the results,
84
00:03:59,767 --> 00:04:02,200
it's very different from
what happens in Google.
85
00:04:02,200 --> 00:04:04,466
Instead of taking you off
to a third-party website,
86
00:04:04,467 --> 00:04:06,066
clicking this data
set, for example,
87
00:04:06,066 --> 00:04:08,433
it will immediately show
you a view of the data,
88
00:04:08,433 --> 00:04:11,299
showing how renewable energy
capacity has developed
89
00:04:11,300 --> 00:04:12,400
over time.
90
00:04:12,400 --> 00:04:15,900
And on the left-hand side, we
have the rest of the data that's
91
00:04:15,900 --> 00:04:17,500
available in this
single data set.
92
00:04:17,500 --> 00:04:21,100
So here we're breaking down the
renewable data sources to see
93
00:04:21,100 --> 00:04:22,467
the composition.
94
00:04:22,467 --> 00:04:26,332
And when we hit visualize,
we see that it's wind that's
95
00:04:26,333 --> 00:04:28,066
actually kicking in
in the last few --
96
00:04:28,066 --> 00:04:29,633
last decade or so there.
97
00:04:29,633 --> 00:04:32,299
We can change the data view, so
we can stack them up to see what
98
00:04:32,300 --> 00:04:33,700
they look like when
they're stacked up,
99
00:04:33,700 --> 00:04:35,866
or we can switch to a
bar chart, for example,
100
00:04:35,867 --> 00:04:38,433
to show the composition
in a single year.
101
00:04:38,433 --> 00:04:41,599
The time slider there underneath
the graph enables you to pick
102
00:04:41,600 --> 00:04:43,800
any time span or time
point that you want.
103
00:04:43,800 --> 00:04:48,000
And we see that back in 1987,
conventional hydroelectric power
104
00:04:48,000 --> 00:04:50,633
was pretty much the only
renewable data source out there.
105
00:04:50,633 --> 00:04:52,133
So a lot has happened since.
106
00:04:52,133 --> 00:04:53,467
So back to the slides, please.
107
00:04:56,400 --> 00:04:58,233
So what else might we find here?
108
00:04:58,233 --> 00:05:01,867
We might find gasoline
prices in Massachusetts,
109
00:05:01,867 --> 00:05:05,900
we might find primary -- so
basically energy production by
110
00:05:05,900 --> 00:05:10,166
source, blue is fossil fuels,
yellow is renewable energy,
111
00:05:10,166 --> 00:05:11,567
still some work to do there.
112
00:05:11,567 --> 00:05:15,333
This data and the data set
before it comes from the EIA.
113
00:05:15,333 --> 00:05:17,700
From Oak Ridge
National Laboratory,
114
00:05:17,700 --> 00:05:20,133
we have car registrations
from all over the world.
115
00:05:20,133 --> 00:05:22,000
And there are a few
other things in there.
116
00:05:22,000 --> 00:05:25,000
All in all, we already have
about 2 million time series
117
00:05:25,000 --> 00:05:28,633
about energy data -- of energy
data available in Data Market,
118
00:05:28,633 --> 00:05:31,799
coming from about 10,000
data sources across 13 data
119
00:05:31,800 --> 00:05:35,367
providers, and they hold about
50 million facts about the world
120
00:05:35,367 --> 00:05:38,633
of energy, all ready
to search, visualize,
121
00:05:38,633 --> 00:05:42,066
compare and download in your
favorite data format, mind you,
122
00:05:42,066 --> 00:05:44,500
from Data Market Energy.
123
00:05:44,500 --> 00:05:48,100
Regardless of the origin,
whether it was an Excel sheet
124
00:05:48,100 --> 00:05:51,834
from the EPA's website or
data from the new and great
125
00:05:51,834 --> 00:05:56,367
electricity price API that the
EIA just launched whose data we
126
00:05:56,367 --> 00:05:58,066
mostly have in our
system already,
127
00:05:58,066 --> 00:05:59,933
it was launched a week ago.
128
00:05:59,934 --> 00:06:05,200
And this is actually new
business for us enabled by open
129
00:06:05,200 --> 00:06:08,533
data from organizations such as
these that we have seen here.
130
00:06:08,533 --> 00:06:12,233
It's new business for us in
taking the data that's already
131
00:06:12,233 --> 00:06:14,533
there, we're not taking anything
away from the value that has
132
00:06:14,533 --> 00:06:17,266
already been released
by the data --
133
00:06:17,266 --> 00:06:19,166
by putting the data out there.
134
00:06:19,166 --> 00:06:20,166
Quite the contrary.
135
00:06:20,166 --> 00:06:23,133
We're adding value on top of
it and making that available to
136
00:06:23,133 --> 00:06:25,366
those that are willing
to pay for the increased
137
00:06:25,367 --> 00:06:28,967
discoverability, productivity
and usefulness of the data once
138
00:06:28,967 --> 00:06:30,099
it's in the system.
139
00:06:30,100 --> 00:06:33,433
So again, new business
enabled by open data.
140
00:06:33,433 --> 00:06:37,467
We're launching this later --
commercially later this month.
141
00:06:37,467 --> 00:06:39,032
Data Market Energy
will be available on
142
00:06:39,033 --> 00:06:42,266
Energy.DataMarket.com, and you
can actually go there today to
143
00:06:42,266 --> 00:06:45,133
read more about the service
and sign up for a free trial as
144
00:06:45,133 --> 00:06:46,066
we go live.
145
00:06:46,066 --> 00:06:47,467
Thank you very much.
146
00:06:47,467 --> 00:06:49,800
(applause)