Popular New Releases in Stream Processing
webtorrent v1.8.14
aria2 1.36.0
webtorrent-desktop v0.24.0
Jackett v0.20.933
popcorn-desktop v0.4.7
Popular Libraries in Stream Processing
by gulpjs (JavaScript, 32240 stars, MIT): A toolkit to automate & enhance your workflow
by webtorrent (JavaScript, 26250 stars, MIT): ⚡️ Streaming torrent client for the web
by aria2 (C++, 26036 stars, NOASSERTION): aria2 is a lightweight multi-protocol & multi-source, cross platform download utility operated in command-line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.
by HelloZeroNet (JavaScript, 16995 stars, NOASSERTION): ZeroNet - Decentralized websites using Bitcoin crypto and BitTorrent network
by qbittorrent (C++, 14553 stars, NOASSERTION): qBittorrent BitTorrent client
by arut (C, 11056 stars, BSD-2-Clause): NGINX-based Media Streaming Server
by webtorrent (JavaScript, 8773 stars, MIT): ❤️ Streaming torrent app for Mac, Windows, and Linux
by Jackett (C#, 7299 stars, GPL-2.0): API Support for your favorite torrent trackers
by Sonarr (C#, 7123 stars, NOASSERTION): Smart PVR for newsgroup and bittorrent users.
Trending New Libraries in Stream Processing
by varbhat (Go, 1428 stars, GPL-3.0): Easy to Use Torrent Client. Can be hosted in Cloud. Files can be streamed in Browser/Media Player.
by iam4x (TypeScript, 955 stars, MIT): 🍿 The all-in-one alternative for Sonarr, Radarr, Jackett... with a VPN and running in docker
by Monibuca (Go, 542 stars, MIT): Monibuca core engine, containing the core streaming-media forwarding logic; it must be combined with feature plugins to run
by dominikbraun (Go, 487 stars, Apache-2.0): timetrace is a simple CLI for tracking your working time.
by yaronzz (C#, 456 stars, Apache-2.0): Download 'TIDAL' Music On Windows/Linux/MacOs (PYTHON/C#)
by Mr-Un1k0d3r (C, 328 stars): Red Team C code repo
by FinotiLucas (TypeScript, 320 stars, Apache-2.0): Complete module to query CEP (Brazilian postal code) information, calculate package shipping prices and delivery times, and track multiple products!
by LGouellec (C#, 256 stars, MIT): .NET Stream Processing Library for Apache Kafka 🚀
by mandreyel (Rust, 240 stars): A BitTorrent V1 engine library for Rust (and currently Linux)
Top Authors in Stream Processing
1. 28 Libraries, 13067 stars
2. 24 Libraries, 43720 stars
3. 15 Libraries, 87 stars
4. 11 Libraries, 1659 stars
5. 11 Libraries, 118 stars
6. 11 Libraries, 159 stars
7. 10 Libraries, 24 stars
8. 10 Libraries, 99 stars
9. 9 Libraries, 582 stars
10. 9 Libraries, 19 stars
Trending Kits in Stream Processing
Developers widely use Python stream processing to query ongoing data streams and respond to important events in timeframes ranging from milliseconds to minutes. Complex event processing, real-time analytics, and streaming analytics are all closely linked to stream processing, which is now the primary framework for executing these use cases.
Stream processing engines are runtime libraries that let developers process streaming data without having to deal with low-level streaming mechanics. Traditionally, data was processed in batches based on a schedule or a predefined trigger (for instance, every night at 1 am, every hundred rows, or whenever the volume reached two megabytes). However, as data volumes and velocities have increased, batch processing alone is no longer enough for many applications, and Python stream processing has become a must-have feature for modern systems. For a wide range of use cases, enterprises have turned to technologies that respond to data as it is created. Unlike batch processing, which collects data and processes it at predetermined intervals, stream processing applications collect and process data the moment it is generated.
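To make the contrast concrete, here is a minimal sketch in plain Python (the sensor source and the alert threshold are invented for the illustration): instead of collecting readings into a nightly batch, the loop reacts to each event the moment it is produced.

import random
import time
from itertools import islice

def sensor_events():
    """Hypothetical unbounded source: yields one temperature reading per event."""
    while True:
        yield {"ts": time.time(), "temp": 20 + random.random() * 10}

THRESHOLD = 28.0  # made-up alert threshold for the example

# Stream processing: handle each event as it arrives rather than on a schedule.
for event in islice(sensor_events(), 100):
    if event["temp"] > THRESHOLD:
        print(f"alert: temp={event['temp']:.1f} at ts={event['ts']:.0f}")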
Python stream processing is most commonly used with data generated as a series of events, such as IoT sensor data, payment processing systems, and server and application logs. The two common paradigms are publisher/subscriber (also known as pub/sub) and source/sink. A publisher or source generates data and events, which are then delivered to a stream processing application; there the data might be augmented, tested against fraud detection algorithms, or otherwise transformed before being sent to a subscriber or sink. Furthermore, all major cloud providers, such as AWS, Google Cloud, and Azure, offer native services that simplify stream processing development on their respective platforms.
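As a minimal sketch of the pub/sub flow described above (an in-memory toy rather than a real broker; the topic names and the fraud rule are invented for the example), a source publishes payment events, a processing stage filters them, and a sink subscribes to the result:

from collections import defaultdict

class Broker:
    """Toy in-memory pub/sub broker: topics map to subscriber callbacks."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)

broker = Broker()

# Sink: consumes the transformed stream.
broker.subscribe("payments.flagged", lambda e: print("flagged:", e))

# Processing stage: tests each event against a made-up fraud rule and
# republishes suspicious ones downstream.
def fraud_check(event):
    if event["amount"] > 10_000:
        broker.publish("payments.flagged", event)

broker.subscribe("payments", fraud_check)

# Source/publisher: emits raw payment events into the stream.
broker.publish("payments", {"id": 1, "amount": 25_000})
broker.publish("payments", {"id": 2, "amount": 40})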
Check out the list below to find more popular Python stream-processing libraries for your applications.
Trending Discussions on Stream Processing
Unhandled error Error: Data cannot be encoded in JSON error at firebase serverless functions
Flink missing windows generated on some partitions
Do I need a JAR file to run a Flink application?
Calling Hibernate in Spring cloud Stream
Filtering in Kafka and other streaming technologies
Pexpect Multi-Threading Idle State
Apache Flink - how to stop and resume stream processing on downstream failure
Hazelcast IMDG vs Hazelcast Jet config
What is benefit of using Kafka streams?
How to filter data in Kafka?
QUESTION
Unhandled error Error: Data cannot be encoded in JSON error at firebase serverless functions
Asked 2022-Apr-04 at 23:15

I'm trying to deploy an API for my application. Using this code raises "Unhandled error Error: Data cannot be encoded in JSON".
const functions = require("firebase-functions");
const axios = require("axios");
exports.getDatas = functions.https.onCall(async (d)=>{
  functions.logger.log(d["name"]);
  cname = d["name"];
  ts1=d["ts1"];
  ts2=d["ts2"];
  const data = await axios.get(
      "https://api.coingecko.com/api/v3/coins/" +
      cname +
      "/market_chart/range?vs_currency=usd&from=" +
      ts1 +
      "&to=" +
      ts2,
  );
  functions.logger.log(data);
  return {data: data};
});
The error log is
19Unhandled error Error: Data cannot be encoded in JSON: function httpAdapter(config) {
20 return new Promise(function dispatchHttpRequest(resolvePromise, rejectPromise) {
21 var onCanceled;
22 function done() {
23 if (config.cancelToken) {
24 config.cancelToken.unsubscribe(onCanceled);
25 }
26
27 if (config.signal) {
28 config.signal.removeEventListener('abort', onCanceled);
29 }
30 }
31 var resolve = function resolve(value) {
32 done();
33 resolvePromise(value);
34 };
35 var rejected = false;
36 var reject = function reject(value) {
37 done();
38 rejected = true;
39 rejectPromise(value);
40 };
41 var data = config.data;
42 var headers = config.headers;
43 var headerNames = {};
44
45 Object.keys(headers).forEach(function storeLowerName(name) {
46 headerNames[name.toLowerCase()] = name;
47 });
48
49 // Set User-Agent (required by some servers)
50 // See https://github.com/axios/axios/issues/69
51 if ('user-agent' in headerNames) {
52 // User-Agent is specified; handle case where no UA header is desired
53 if (!headers[headerNames['user-agent']]) {
54 delete headers[headerNames['user-agent']];
55 }
56 // Otherwise, use specified value
57 } else {
58 // Only set header if it hasn't been set in config
59 headers['User-Agent'] = 'axios/' + VERSION;
60 }
61
62 if (data && !utils.isStream(data)) {
63 if (Buffer.isBuffer(data)) {
64 // Nothing to do...
65 } else if (utils.isArrayBuffer(data)) {
66 data = Buffer.from(new Uint8Array(data));
67 } else if (utils.isString(data)) {
68 data = Buffer.from(data, 'utf-8');
69 } else {
70 return reject(createError(
71 'Data after transformation must be a string, an ArrayBuffer, a Buffer, or a Stream',
72 config
73 ));
74 }
75
76 if (config.maxBodyLength > -1 && data.length > config.maxBodyLength) {
77 return reject(createError('Request body larger than maxBodyLength limit', config));
78 }
79
80 // Add Content-Length header if data exists
81 if (!headerNames['content-length']) {
82 headers['Content-Length'] = data.length;
83 }
84 }
85
86 // HTTP basic authentication
87 var auth = undefined;
88 if (config.auth) {
89 var username = config.auth.username || '';
90 var password = config.auth.password || '';
91 auth = username + ':' + password;
92 }
93
94 // Parse url
95 var fullPath = buildFullPath(config.baseURL, config.url);
96 var parsed = url.parse(fullPath);
97 var protocol = parsed.protocol || 'http:';
98
99 if (!auth && parsed.auth) {
100 var urlAuth = parsed.auth.split(':');
101 var urlUsername = urlAuth[0] || '';
102 var urlPassword = urlAuth[1] || '';
103 auth = urlUsername + ':' + urlPassword;
104 }
105
106 if (auth && headerNames.authorization) {
107 delete headers[headerNames.authorization];
108 }
109
110 var isHttpsRequest = isHttps.test(protocol);
111 var agent = isHttpsRequest ? config.httpsAgent : config.httpAgent;
112
113 var options = {
114 path: buildURL(parsed.path, config.params, config.paramsSerializer).replace(/^\?/, ''),
115 method: config.method.toUpperCase(),
116 headers: headers,
117 agent: agent,
118 agents: { http: config.httpAgent, https: config.httpsAgent },
119 auth: auth
120 };
121
122 if (config.socketPath) {
123 options.socketPath = config.socketPath;
124 } else {
125 options.hostname = parsed.hostname;
126 options.port = parsed.port;
127 }
128
129 var proxy = config.proxy;
130 if (!proxy && proxy !== false) {
131 var proxyEnv = protocol.slice(0, -1) + '_proxy';
132 var proxyUrl = process.env[proxyEnv] || process.env[proxyEnv.toUpperCase()];
133 if (proxyUrl) {
134 var parsedProxyUrl = url.parse(proxyUrl);
135 var noProxyEnv = process.env.no_proxy || process.env.NO_PROXY;
136 var shouldProxy = true;
137
138 if (noProxyEnv) {
139 var noProxy = noProxyEnv.split(',').map(function trim(s) {
140 return s.trim();
141 });
142
143 shouldProxy = !noProxy.some(function proxyMatch(proxyElement) {
144 if (!proxyElement) {
145 return false;
146 }
147 if (proxyElement === '*') {
148 return true;
149 }
150 if (proxyElement[0] === '.' &&
151 parsed.hostname.substr(parsed.hostname.length - proxyElement.length) === proxyElement) {
152 return true;
153 }
154
155 return parsed.hostname === proxyElement;
156 });
157 }
158
159 if (shouldProxy) {
160 proxy = {
161 host: parsedProxyUrl.hostname,
162 port: parsedProxyUrl.port,
163 protocol: parsedProxyUrl.protocol
164 };
165
166 if (parsedProxyUrl.auth) {
167 var proxyUrlAuth = parsedProxyUrl.auth.split(':');
168 proxy.auth = {
169 username: proxyUrlAuth[0],
170 password: proxyUrlAuth[1]
171 };
172 }
173 }
174 }
175 }
176
177 if (proxy) {
178 options.headers.host = parsed.hostname + (parsed.port ? ':' + parsed.port : '');
179 setProxy(options, proxy, protocol + '//' + parsed.hostname + (parsed.port ? ':' + parsed.port : '') + options.path);
180 }
181
182 var transport;
183 var isHttpsProxy = isHttpsRequest && (proxy ? isHttps.test(proxy.protocol) : true);
184 if (config.transport) {
185 transport = config.transport;
186 } else if (config.maxRedirects === 0) {
187 transport = isHttpsProxy ? https : http;
188 } else {
189 if (config.maxRedirects) {
190 options.maxRedirects = config.maxRedirects;
191 }
192 transport = isHttpsProxy ? httpsFollow : httpFollow;
193 }
194
195 if (config.maxBodyLength > -1) {
196 options.maxBodyLength = config.maxBodyLength;
197 }
198
199 if (config.insecureHTTPParser) {
200 options.insecureHTTPParser = config.insecureHTTPParser;
201 }
202
203 // Create the request
204 var req = transport.request(options, function handleResponse(res) {
205 if (req.aborted) return;
206
207 // uncompress the response body transparently if required
208 var stream = res;
209
210 // return the last request in case of redirects
211 var lastRequest = res.req || req;
212
213
214 // if no content, is HEAD request or decompress disabled we should not decompress
215 if (res.statusCode !== 204 && lastRequest.method !== 'HEAD' && config.decompress !== false) {
216 switch (res.headers['content-encoding']) {
217 /*eslint default-case:0*/
218 case 'gzip':
219 case 'compress':
220 case 'deflate':
221 // add the unzipper to the body stream processing pipeline
222 stream = stream.pipe(zlib.createUnzip());
223
224 // remove the content-encoding in order to not confuse downstream operations
225 delete res.headers['content-encoding'];
226 break;
227 }
228 }
229
230 var response = {
231 status: res.statusCode,
232 statusText: res.statusMessage,
233 headers: res.headers,
234 config: config,
235 request: lastRequest
236 };
237
238 if (config.responseType === 'stream') {
239 response.data = stream;
240 settle(resolve, reject, response);
241 } else {
242 var responseBuffer = [];
243 var totalResponseBytes = 0;
244 stream.on('data', function handleStreamData(chunk) {
245 responseBuffer.push(chunk);
246 totalResponseBytes += chunk.length;
247
248 // make sure the content length is not over the maxContentLength if specified
249 if (config.maxContentLength > -1 && totalResponseBytes > config.maxContentLength) {
250 // stream.destoy() emit aborted event before calling reject() on Node.js v16
251 rejected = true;
252 stream.destroy();
253 reject(createError('maxContentLength size of ' + config.maxContentLength + ' exceeded',
254 config, null, lastRequest));
255 }
256 });
257
258 stream.on('aborted', function handlerStreamAborted() {
259 if (rejected) {
260 return;
261 }
262 stream.destroy();
263 reject(createError('error request aborted', config, 'ERR_REQUEST_ABORTED', lastRequest));
264 });
265
266 stream.on('error', function handleStreamError(err) {
267 if (req.aborted) return;
268 reject(enhanceError(err, config, null, lastRequest));
269 });
270
271 stream.on('end', function handleStreamEnd() {
272 try {
273 var responseData = responseBuffer.length === 1 ? responseBuffer[0] : Buffer.concat(responseBuffer);
274 if (config.responseType !== 'arraybuffer') {
275 responseData = responseData.toString(config.responseEncoding);
276 if (!config.responseEncoding || config.responseEncoding === 'utf8') {
277 responseData = utils.stripBOM(responseData);
278 }
279 }
280 response.data = responseData;
281 } catch (err) {
282 reject(enhanceError(err, config, err.code, response.request, response));
283 }
284 settle(resolve, reject, response);
285 });
286 }
287 });
288
289 // Handle errors
290 req.on('error', function handleRequestError(err) {
291 if (req.aborted && err.code !== 'ERR_FR_TOO_MANY_REDIRECTS') return;
292 reject(enhanceError(err, config, null, req));
293 });
294
295 // set tcp keep alive to prevent drop connection by peer
296 req.on('socket', function handleRequestSocket(socket) {
297 // default interval of sending ack packet is 1 minute
298 socket.setKeepAlive(true, 1000 * 60);
299 });
300
301 // Handle request timeout
302 if (config.timeout) {
303 // This is forcing a int timeout to avoid problems if the `req` interface doesn't handle other types.
304 var timeout = parseInt(config.timeout, 10);
305
306 if (isNaN(timeout)) {
307 reject(createError(
308 'error trying to parse `config.timeout` to int',
309 config,
310 'ERR_PARSE_TIMEOUT',
311 req
312 ));
313
314 return;
315 }
316
317 // Sometime, the response will be very slow, and does not respond, the connect event will be block by event loop system.
318 // And timer callback will be fired, and abort() will be invoked before connection, then get "socket hang up" and code ECONNRESET.
319 // At this time, if we have a large number of request, nodejs will hang up some socket on background. and the number will up and up.
320 // And then these socket which be hang up will devoring CPU little by little.
321 // ClientRequest.setTimeout will be fired on the specify milliseconds, and can make sure that abort() will be fired after connect.
322 req.setTimeout(timeout, function handleRequestTimeout() {
323 req.abort();
324 var transitional = config.transitional || defaults.transitional;
325 reject(createError(
326 'timeout of ' + timeout + 'ms exceeded',
327 config,
328 transitional.clarifyTimeoutError ? 'ETIMEDOUT' : 'ECONNABORTED',
329 req
330 ));
331 });
332 }
333
334 if (config.cancelToken || config.signal) {
335 // Handle cancellation
336 // eslint-disable-next-line func-names
337 onCanceled = function(cancel) {
338 if (req.aborted) return;
339
340 req.abort();
341 reject(!cancel || (cancel && cancel.type) ? new Cancel('canceled') : cancel);
342 };
343
344 config.cancelToken && config.cancelToken.subscribe(onCanceled);
345 if (config.signal) {
346 config.signal.aborted ? onCanceled() : config.signal.addEventListener('abort', onCanceled);
347 }
348 }
349
350
351 // Send the request
352 if (utils.isStream(data)) {
353 data.on('error', function handleStreamError(err) {
354 reject(enhanceError(err, config, null, req));
355 }).pipe(req);
356 } else {
357 req.end(data);
358 }
359 });
360}
361 at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:162:11)
362 at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
363 at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
364 at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
365 at /workspace/node_modules/firebase-functions/lib/common/providers/https.js:334:22
366 at processTicksAndRejections (internal/process/task_queues.js:97:5)
The first logger logs the parameter I passed correctly, and the logger that logs data shows output in this format:
367...["api.coingecko.com:443::::::::::::::::::"]},"keepAliveMsecs":1000,"maxFreeSockets":256,"scheduling":"fifo","keepAlive":false,"maxSockets":null},"_removedConnection":false,"writable":true},"status":200,"data":{"prices":[[1615345414698,37.27069164629981],[1615349310788,36.95627388647297],[1615352802175,37.48630338203377],[1615356202751,37.46442850999597],[1615360079361,37.642735963063906],[1615363905145,38.29435586902702],[1615367492353,38.313292928237594],[1615370461299,38.75503558097479],[1615374138056,38.24406575020552],[1615377815960,38.237026584388175],[1615381321332,38.93964664468625],[1615384813000,39.262646397955635],[1615388739874,39.15882057568881],[1615392094129,38.94488140309047],[1615395966875,38.79820936257378],[1615399312625,38.51637055616189],[1615403055037,38.59237008394828],[1615406529740,38.44087305010874],[1615410281814,37.71855645797291],[1615414278815,38.374824600586976],[1615417716420,38.4538669693684],[1615421045728,37.62772478442999],[1615425672990,36.8826465121472],[1615429587089,37.41958697414903],[1615432278494,37.34865694722488],[1615435254265,37.16289143388951],[1615439122292,37.14731463575248],[1615442523394,36.801517989796814],[1615446290102,37.02248224990424],[1615450361470,36.164787531097126],[1615453299572,36.46191265162147],[1615457172317,36.174755169666334],[1615460886498,37.05778010952229],[1615464298322,37.336909500902365],[1615469586325,37.56497212211488],[1615472126260,37.83046394206218],[1615474882979,37.252561357731096],[1615478498201,36.56190097084664],[1615482336185,36.83824760787625],[1615485957910,36.89351702770813],[1615489642151,37.589229946501746],[1615493390438,37.33184737771527],[1615496666244,37.29234576242379],[1615500577712,37.284260441548426],[1616866645601,1137195941.0307472],[1616870299925,1089416195.9864128],[1616873841648,1074341877.495249],[1616877368137,1061555457.3375872],[1616880970910,1077775411.1216433],[1616884693948,1064594490.6022671],[1616887998472,1087481667.611567],[1616891397951,1068140794.5197278],[1616894759953,1078753362.1719048],[1616898371565,1053546315.1245787],[1616902002474,1052498816.7223371],[1616905584364,1026915395.5541993],[1616909101481,1022271206.3215427],[1616912730390,997185793.1210617],[1616916434482,972130048.6316774],[1616919928611,988711196.2721183],[1616923534317,987299160.6191593],[1616926264719,975360472.6011684],[1616930074136,958327264.7346151],[1616933292776,935085970.8922312],[1616936940791,896217168.3654604],[1616940936847,878876312.6707534],[1616944090304,890504985.5476977],[1616948321869,896715385.5657766],[1616952007508,870767231.0865391],[1616955544207,880601758.4610194],[1616958381375,896794852.1077055],[1616962022167,929362788.5783823],[1616966479654,927502494.4691795],[1616969648773,880385481.5284289],[1616973545649,862329007.9935848],[1616977463095,840138544.6360805],[1616980359587,849727926.595521],[1616984356096,820616225.3306137],[1616987602367,898085663.0760688],[1616990444958,890215727.4112909],[1616995470635,914823340.6343507],[1616999032159,890922230.685704],[1617002651977,937214914.0703756],[1617005329558,976030203.3879734],[1617009370471,1061898884.4388478],[1617012348377,1111994349.2592206],[1617015705482,1175310227.1595278],[1617019895549,1217044915.3900926],[1617022941451,1204239369.9336267],[1617027118715,1225123359.1178432],[1617031210170,1191418570.9198012],[1617033728601,1257085051.9742537],[1617037882992,1261291734.3667347],[1617041858553,1265805909.4506621],[1617044547418,1261869965.5784621],[1617049418534,1225924891.220988],[1617052450394,1200646247.466799],[
1617055896172,1209247034.0807025],[1617059684123,1249662106.3996315],[1617062561979,837849935.5380555],[1617066155823,1261094295.2039979],[1617070572708,1244044711.3556864],[1617074210159,1178503497.252399],[1617077106612,1184744920.254339],[1617080571662,1219164970.9205332],[1617084836477,1174744890.1399443],[1617087739776,1236332180.5454476],[1617092763739,1121685108.4046226],[1617096303391,1074005978.1362224],[1617100013739,1075898891.906641],[1617102136947,1041120230.0169744],[1617106411165,1021062028.7444541],[1617110588848,1004207600.6385714],[1617114148509,983098685.435342],[1617117449987,983878432.6976557],[1617120868725,943893192.0239582],[1617123806180,948379973.8680001],[1617128347360,948328240.0510467],[1617131244094,923477307.6495335],[1617134866719,918321070.6284192],[1617138697011,960178009.2986945],[1617142067857,974105207.7725881],[1617146083923,973959760.0729104],[1617149999086,959500047.5209063],[1617153094367,1007753562.6156206],[1617156698445,1021534121.1115336],[1617160175611,1028067427.0339341],[1617163928330,1007755251.8882328],[1617166924538,1023240773.0466446],[1617171886674,1037535813.1806505],[1617175133694,1101375379.7094195],[1617178435173,1136688478.90344],[1617182857658,1208366620.2561867],[1617185353773,1208823054.3509212],[1617188828477,1234197192.568771],[1617193393471,1707076315.380663],[1617196301983,1845668637.7358408],[1617199516026,1901877634.1385415],[1617203681947,2015292037.1305778],[1617207515426,2141098631.115179],[1617210224998,2343473154.2871637],[1617214323265,2329074198.4966955],[1617217968405,2461828129.1798186],[1617221653017,2493042958.539376],[1617224582971,2532015555.7692595],[1617228589364,2508661361.110037],[1617232204720,2590057969.924583],[1617235260464,2749780924.550207],[1617239367664,2791689438.967896],[1617243152558,2778422749.5901804],[1617246573894,2802892972.2612605],[1617250114952,2795446026.902383],[1617253276300,2837092221.188881],[1617257741390,2957061611.281718],[1617261111556,3025594776.954216],[1617264301698,3140730366.12618],[1617267704421,3230797741.627739],[1617272276500,3247001347.7404704],[1617275862720,3182990384.8873067],[1617279129292,2889317168.9977646],[1617283053665,2753527702.506779],[1617287046529,2700392654.8781624],[1617290204012,2616296684.424929],[1617293298853,2494255828.9768047],[1617296557242,2383424694.8900166],[1617301325511,2288268623.177356],[1617303766777,2297155897.636895],[1617307669347,2314935325.319679],[1617311721980,2259716784.056617],[1617314946823,2267889595.9127536],[1617319572007,2174169254.528509],[1617323182318,2097690604.8152165],[1617326033792,2110975746.1916978],[1617329489226,2126100629.800452],[1617332409284,2193182655.044224],[1617337211709,2199847063.5248647],[1617340611316,2167549077.601362],[1617344146863,2110348803.8388174],[1617347361962,2023115590.5637138],[1617351380142,1864316761.5098753],[1617354151186,1788973202.0040677],[1617359277447,1731207666.0376515],[1617361312976,1418566500.3106787],[1617366169158,1693859181.5518322],[1617369860769,1656689094.290342],[1617372306072,1660176536.7450612],[1617376754676,1722154482.4234965],[1617379285817,1915067128.493045],[1617383311995,1982773491.2907202],[1617387963188,1985155493.939231],[1617391564495,1827213471.6221747],[1617395202777,1932891922.7380657],[1617398214973,1937931474.560893],[1617401809690,1961473630.4188676],[1617405699909,1952347409.661483],[1617409553080,2172811188.054834],[1617412963837,2431917537.219363],[1617416445822,2666886575.1140027],[1617420431122,2769520722.4907126],[1617422613890,2797409323.779513],[1
617427393260,2895546310.6951184],[1617431058021,2894169435.883223],[1617433696700,2651591430.614699],[1617437513773,3448548871.8910036],[1617441138039,3537764498.5278754],[1617444820385,3662623380.0181885],[1617448128419,3729999481.3895626],[1617452094944,3741683833.307362],[1617457034540,3761774670.321721],[1617460631688,3809173022.555833],[1617464335978,3711591162.8519845],[1617467879738,3759143118.4621553],[1617471447610,3693936894.7524076],[1617474960418,3833857114.2069917],[1617478639837,3888109113.59996],[1617482233320,3857034438.9984646],[1617485821346,3898924734.2645984],[1617489477282,3952661186.2182713],[1617493109729,4002501827.9437523],[1617495709286,3872814933.0218143],[1617499443431,3939579930.8108554],[1617503699037,3663106636.5813146],[1617507443725,3808705623.491391],[1617510706891,3786240536.055139],[1617512446242,3717882675.3539762],[1617516040645,3722966733.2957063],[1617519813304,3482249884.952562],[1617523351916,3345586253.508183],[1617526909722,3327000473.8244348],[1617530664916,3181835266.2617188],[1617534176048,3094776290.1306324],[1617537924632,3064167829.684326],[1617541493704,3112790145.252149],[1617545018360,2989449570.670528],[1617548594506,3016965749.017692],[1617552471191,2973530338.557288],[1617555933696,2759208177.1915674],[1617559387440,2662906186.1813793],[1617563034515,2521716547.9565806],[1617566483711,2454800946.788864],[1617570325792,2412175803.4922743],[1617573668989,2381142461.766321],[1617577282876,2228904400.2017546],[1617580896737,2203439508.717633],[1617584514686,2083961834.3200803],[1617588367701,1922511436.832222],[1617591869391,1816453643.1859522],[1617595346098,1783362433.1356776],[1617599069131,1767878927.408502],[1617602711113,1782121869.0062866],[1617606278078,1784322317.8294444],[1617609891135,1785304724.1970084],[1617613319383,1792007217.4012969],[1617617302304,1808002080.6732872],[1617620901014,1821923720.87615],[1617624265084,1769426364.6123836],[1617629555312,1731155926.337212],[1617631504259,1735378701.9021676],[1617635133537,1942437073.2385755],[1617638780500,1938122743.6976163],[1617642119732,1932182393.8447528],[1617645707597,1918416705.3436842],[1617649325384,1925855235.7182896],[1617653252063,1944708214.0244768],[1617656889033,1932665022.73478],[1617660329160,1943687775.1192245],[1617663683699,1971924479.2343264],[1617667435208,2101421530.2666874],[1617672769205,2175322213.812557],[1617674524812,2168578229.7784457],[1617678186353,2149217571.1759067],[1617681915267,2132725563.885806],[1617685469475,1907950838.2268875],[1617689189705,2026223167.4473426],[1617692670953,1991840998.8517568],[1617696101989,1958389716.0448081],[1617699877898,2027665770.2623076],[1617703590445,2045913908.1590445],[1617707076556,2057724347.183567],[1617710622851,1722203248.9530182],[1617714225215,2160140597.446546],[1617717905528,2192080372.5552874],[1617721488585,2199844279.449877],[1617724918808,2244159138.5689125],[1617728548093,2263548854.897557],[1617732187891,2106855536.9938018],[1617735969816,2268365061.664965],[1617739538518,1863113060.588111],[1617742875565,2296819840.9881096],[1617746516853,2308037223.56185],[1617750327052,2297405821.9954567],[1617754017835,2215648462.217197],[1617758617023,2112353884.9607923],[1617761085616,2094123582.0260437],[1617764518134,2101292245.7045105],[1617768287923,2104106865.0792534],[1617771810289,2127056476.4717],[1617775566730,2152196953.3590703],[1617778865860,2160666464.579131],[1617782881414,2201171213.1865735],[1617786249160,2203934869.139618],[1617789807394,2329117281.806726],[1617793383957,2333039138.8999
13],[1617796986959,2491205752.3653517],[1617800521125,2652604590.3673797],[1617804331429,2692817000.168284],[1617807822435,2121796914.212729],[1617811418506,2538097921.330415],[1617815037057,2572049083.87979],[1617818698211,2550478468.4248347],[1617822347031,2541491737.3311806],[1617825969097,2609118564.630648],[1617829326876,2651351577.1099257],[1617833305171,2429954572.560337],[1617837011298,2435043578.3313527],[1617840572965,2394428204.082167],[1617843841041,2446826032.07983],[1617848315742,2395082349.188743],[1617850339793,2376349751.741466],[1617852591890,2385498650.2366877],[1617855126472,2380054416.699361],[1617858732962,2424505564.216302],[1617862619633,2434391633.272485],[1617865876330,2410962812.9744062],[1617869233838,2516114320.406959],[1617872539799,2437748581.3302546],[1617876293610,2247205079.171164],[1617880005259,2149347865.150653],[1617883394235,1893777066.5168178],[1617886836203,1757412804.559377],[1617892197847,1668727963.8671286],[1617894162445,1631584545.4824028],[1617897737215,1596293896.725426],[1617901282046,1525523967.3370435],[1617905003853,1370316987.26801],[1617908631874,1358993841.079183],[1617912335250,1404691449.9164236],[1617915995319,1379405950.1047523],[1617919567600,1366246502.7408085],[1617923270275,1289254721.1461022],[1617926622919,1386402238.6371279],[1617930228859,1384851642.1789908],[1617933705891,1365548610.2907162],[1617937372258,1357266138.9978309],[1617941122560,1335764096.6047564],[1617944870896,1322495289.1105938],[1617948462328,1283751933.8339043],[1617951863802,1272489837.990008],[1617955666499,1259096045.8789752],[1617958890026,1247182948.0102005],[1617962609987,1220448763.9536679],[1617966256703,1222538618.147044],[1617969964555,1148194206.4734476],[1617973333279,1199996169.7479842],[1617977646106,1154935691.529977],[1617980504476,1144564005.003322],[1617984273306,1132822242.6037295],[1617987925282,1136733019.0246003],[1617991396077,1139090847.1565342],[1617994822351,1133169530.4839995],[1617998615234,1113274570.5832539],[1618002141094,1094805189.6349592],[1618005876460,1035579604.067034],[1618009282025,1090335224.3969038],[1618013035782,1063984405.5106469],[1618016519119,1058097513.8615906],[1618020114108,1065381128.0365001]]}}
When this code is invoked, it logs the data correctly, but I cannot return it from the function. Can anyone help?
ANSWER
Answered 2022-Apr-04 at 23:14

The problem appears to be that you're trying to return the entire Axios response. This cannot be serialised as JSON due to circular references. Simply return the response data instead. You can also make your URL construction simpler (and safer) using the params option:
const functions = require("firebase-functions");
const axios = require("axios");

exports.getDatas = functions.https.onCall(async (d) => {
  functions.logger.log(d.name);
  const res = await axios.get(
      "https://api.coingecko.com/api/v3/coins/" + d.name + "/market_chart/range",
      {
        params: {
          vs_currency: "usd",
          from: d.ts1,
          to: d.ts2,
        },
      },
  );
  // Return only the JSON payload; the full Axios response is not JSON-serialisable.
  return {data: res.data};
});
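Callable functions serialise their return value to JSON before sending it to the client, so only plain data can be returned. res.data is just the decoded JSON body from the CoinGecko API, whereas the full response object carries functions, agents, and sockets (exactly what shows up in the error log above) that the encoder cannot handle.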
617427393260,2895546310.6951184],[1617431058021,2894169435.883223],[1617433696700,2651591430.614699],[1617437513773,3448548871.8910036],[1617441138039,3537764498.5278754],[1617444820385,3662623380.0181885],[1617448128419,3729999481.3895626],[1617452094944,3741683833.307362],[1617457034540,3761774670.321721],[1617460631688,3809173022.555833],[1617464335978,3711591162.8519845],[1617467879738,3759143118.4621553],[1617471447610,3693936894.7524076],[1617474960418,3833857114.2069917],[1617478639837,3888109113.59996],[1617482233320,3857034438.9984646],[1617485821346,3898924734.2645984],[1617489477282,3952661186.2182713],[1617493109729,4002501827.9437523],[1617495709286,3872814933.0218143],[1617499443431,3939579930.8108554],[1617503699037,3663106636.5813146],[1617507443725,3808705623.491391],[1617510706891,3786240536.055139],[1617512446242,3717882675.3539762],[1617516040645,3722966733.2957063],[1617519813304,3482249884.952562],[1617523351916,3345586253.508183],[1617526909722,3327000473.8244348],[1617530664916,3181835266.2617188],[1617534176048,3094776290.1306324],[1617537924632,3064167829.684326],[1617541493704,3112790145.252149],[1617545018360,2989449570.670528],[1617548594506,3016965749.017692],[1617552471191,2973530338.557288],[1617555933696,2759208177.1915674],[1617559387440,2662906186.1813793],[1617563034515,2521716547.9565806],[1617566483711,2454800946.788864],[1617570325792,2412175803.4922743],[1617573668989,2381142461.766321],[1617577282876,2228904400.2017546],[1617580896737,2203439508.717633],[1617584514686,2083961834.3200803],[1617588367701,1922511436.832222],[1617591869391,1816453643.1859522],[1617595346098,1783362433.1356776],[1617599069131,1767878927.408502],[1617602711113,1782121869.0062866],[1617606278078,1784322317.8294444],[1617609891135,1785304724.1970084],[1617613319383,1792007217.4012969],[1617617302304,1808002080.6732872],[1617620901014,1821923720.87615],[1617624265084,1769426364.6123836],[1617629555312,1731155926.337212],[1617631504259,1735378701.9021676],[1617635133537,1942437073.2385755],[1617638780500,1938122743.6976163],[1617642119732,1932182393.8447528],[1617645707597,1918416705.3436842],[1617649325384,1925855235.7182896],[1617653252063,1944708214.0244768],[1617656889033,1932665022.73478],[1617660329160,1943687775.1192245],[1617663683699,1971924479.2343264],[1617667435208,2101421530.2666874],[1617672769205,2175322213.812557],[1617674524812,2168578229.7784457],[1617678186353,2149217571.1759067],[1617681915267,2132725563.885806],[1617685469475,1907950838.2268875],[1617689189705,2026223167.4473426],[1617692670953,1991840998.8517568],[1617696101989,1958389716.0448081],[1617699877898,2027665770.2623076],[1617703590445,2045913908.1590445],[1617707076556,2057724347.183567],[1617710622851,1722203248.9530182],[1617714225215,2160140597.446546],[1617717905528,2192080372.5552874],[1617721488585,2199844279.449877],[1617724918808,2244159138.5689125],[1617728548093,2263548854.897557],[1617732187891,2106855536.9938018],[1617735969816,2268365061.664965],[1617739538518,1863113060.588111],[1617742875565,2296819840.9881096],[1617746516853,2308037223.56185],[1617750327052,2297405821.9954567],[1617754017835,2215648462.217197],[1617758617023,2112353884.9607923],[1617761085616,2094123582.0260437],[1617764518134,2101292245.7045105],[1617768287923,2104106865.0792534],[1617771810289,2127056476.4717],[1617775566730,2152196953.3590703],[1617778865860,2160666464.579131],[1617782881414,2201171213.1865735],[1617786249160,2203934869.139618],[1617789807394,2329117281.806726],[1617793383957,2333039138.8999
13],[1617796986959,2491205752.3653517],[1617800521125,2652604590.3673797],[1617804331429,2692817000.168284],[1617807822435,2121796914.212729],[1617811418506,2538097921.330415],[1617815037057,2572049083.87979],[1617818698211,2550478468.4248347],[1617822347031,2541491737.3311806],[1617825969097,2609118564.630648],[1617829326876,2651351577.1099257],[1617833305171,2429954572.560337],[1617837011298,2435043578.3313527],[1617840572965,2394428204.082167],[1617843841041,2446826032.07983],[1617848315742,2395082349.188743],[1617850339793,2376349751.741466],[1617852591890,2385498650.2366877],[1617855126472,2380054416.699361],[1617858732962,2424505564.216302],[1617862619633,2434391633.272485],[1617865876330,2410962812.9744062],[1617869233838,2516114320.406959],[1617872539799,2437748581.3302546],[1617876293610,2247205079.171164],[1617880005259,2149347865.150653],[1617883394235,1893777066.5168178],[1617886836203,1757412804.559377],[1617892197847,1668727963.8671286],[1617894162445,1631584545.4824028],[1617897737215,1596293896.725426],[1617901282046,1525523967.3370435],[1617905003853,1370316987.26801],[1617908631874,1358993841.079183],[1617912335250,1404691449.9164236],[1617915995319,1379405950.1047523],[1617919567600,1366246502.7408085],[1617923270275,1289254721.1461022],[1617926622919,1386402238.6371279],[1617930228859,1384851642.1789908],[1617933705891,1365548610.2907162],[1617937372258,1357266138.9978309],[1617941122560,1335764096.6047564],[1617944870896,1322495289.1105938],[1617948462328,1283751933.8339043],[1617951863802,1272489837.990008],[1617955666499,1259096045.8789752],[1617958890026,1247182948.0102005],[1617962609987,1220448763.9536679],[1617966256703,1222538618.147044],[1617969964555,1148194206.4734476],[1617973333279,1199996169.7479842],[1617977646106,1154935691.529977],[1617980504476,1144564005.003322],[1617984273306,1132822242.6037295],[1617987925282,1136733019.0246003],[1617991396077,1139090847.1565342],[1617994822351,1133169530.4839995],[1617998615234,1113274570.5832539],[1618002141094,1094805189.6349592],[1618005876460,1035579604.067034],[1618009282025,1090335224.3969038],[1618013035782,1063984405.5106469],[1618016519119,1058097513.8615906],[1618020114108,1065381128.0365001]]}}
exports.getDatas = functions.https.onCall(async ({ name, ts1, ts2 }) => {
  functions.logger.log(name);
  // 👇 note the destructure here
  const { data } = await axios.get(
    `https://api.coingecko.com/api/v3/coins/${encodeURIComponent(name)}/market_chart/range`,
    {
      params: {
        vs_currency: "usd",
        from: ts1,
        to: ts2,
      }
    }
  );
  functions.logger.log(data);
  return { data };
});
QUESTION
Flink missing windows generated on some partitions
Asked 2022-Feb-14 at 20:51
I am trying to write a small Flink dataflow to better understand how it works, and I am facing a strange situation where each run produces inconsistent output. Sometimes records that I am expecting are missing. Keep in mind this is just a toy example I am building to learn the concepts of the DataStream API.
I have a dataset of around 7600 rows in CSV format that look like this:
Date,Country,City,Specie,count,min,max,median,variance
28/06/2021,GR,Athens,no2,116,0.5,58.9,5.5,2824.39
28/06/2021,GR,Athens,wind-speed,133,0.1,11.2,3,96.69
28/06/2021,GR,Athens,dew,24,14,20,18,35.92
28/06/2021,GR,Athens,temperature,141,24.4,38.4,30.5,123.18
28/06/2021,GR,Athens,pm25,116,34,85,68,702.29
Full dataset here: https://pastebin.com/rknnRnPc
There are no special characters or quotes, so a simple String split will work fine.
The date range for each city spans from 28/06/2021 to 03/10/2021.
I am reading it using the DataStream API:
final DataStream<String> source = env.readTextFile("data.csv");
Each row is mapped to a simple POJO as follows:
public class CityMetric {

  private static final DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern("dd/MM/yyyy");

  private final LocalDate localDate;
  private final String country;
  private final String city;
  private final String reading;
  private final int count;
  private final double min;
  private final double max;
  private final double median;
  private final double variance;

  private CityMetric(LocalDate localDate, String country, String city, String reading, int count, double min, double max, double median, double variance) {
    this.localDate = localDate;
    this.country = country;
    this.city = city;
    this.reading = reading;
    this.count = count;
    this.min = min;
    this.max = max;
    this.median = median;
    this.variance = variance;
  }

  public static CityMetric fromArray(String[] arr) {
    LocalDate date = LocalDate.parse(arr[0], dateFormatter);
    int count = Integer.parseInt(arr[4]);
    double min = Double.parseDouble(arr[5]);
    double max = Double.parseDouble(arr[6]);
    double median = Double.parseDouble(arr[7]);
    double variance = Double.parseDouble(arr[8]);

    return new CityMetric(date, arr[1], arr[2], arr[3], count, min, max, median, variance);
  }

  public long getTimestamp() {
    return getLocalDate()
        .atStartOfDay()
        .toInstant(ZoneOffset.UTC)
        .toEpochMilli();
  }

  //getters follow
}
The records are all in order of date, so I have this to set the event time and watermark:
final WatermarkStrategy<CityMetric> cityMetricWatermarkStrategy =
    WatermarkStrategy.<CityMetric>forMonotonousTimestamps() //we know they are sorted by time
        .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());
I have a StreamingFileSink on a Tuple4 to output the date range, city and average:
final StreamingFileSink<Tuple4<LocalDate, LocalDate, String, Double>> fileSink =
    StreamingFileSink.forRowFormat(
            new Path("airquality"),
            new SimpleStringEncoder<Tuple4<LocalDate, LocalDate, String, Double>>("UTF-8"))
        .build();
And finally I have the dataflow as follows:
source
    .map(s -> s.split(",")) //split the CSV row into its fields
    .filter(arr -> !arr[0].startsWith("Date")) // if it starts with Date it means it is the top header
    .map(CityMetric::fromArray) //create the object from the fields
    .assignTimestampsAndWatermarks(cityMetricWatermarkStrategy) // we use the date as the event time
    .filter(cm -> cm.getReading().equals("pm25")) // we want air quality of fine particulate matter pm2.5
    .keyBy(CityMetric::getCity) // partition by city name
    .window(TumblingEventTimeWindows.of(Time.days(7))) //windows of 7 days
    .aggregate(new CityAverageAggregate()) // average the values
    .name("cityair")
    .addSink(fileSink); //output each partition to a file
The CityAverageAggregate just accumulates the sum and count, and keeps track of the earliest and latest dates of the range it is covering.
public class CityAverageAggregate
    implements AggregateFunction<
        CityMetric, CityAverageAggregate.AverageAccumulator, Tuple4<LocalDate, LocalDate, String, Double>> {

  @Override
  public AverageAccumulator createAccumulator() {
    return new AverageAccumulator();
  }

  @Override
  public AverageAccumulator add(CityMetric cityMetric, AverageAccumulator averageAccumulator) {
    return averageAccumulator.add(
        cityMetric.getCity(), cityMetric.getLocalDate(), cityMetric.getMedian());
  }

  @Override
  public Tuple4<LocalDate, LocalDate, String, Double> getResult(
      AverageAccumulator averageAccumulator) {
    return Tuple4.of(
        averageAccumulator.getStart(),
        averageAccumulator.getEnd(),
        averageAccumulator.getCity(),
        averageAccumulator.average());
  }

  @Override
  public AverageAccumulator merge(AverageAccumulator acc1, AverageAccumulator acc2) {
    return acc1.merge(acc2);
  }

  public static class AverageAccumulator {
    private final String city;
    private final LocalDate start;
    private final LocalDate end;
    private final long count;
    private final double sum;

    public AverageAccumulator() {
      city = "";
      count = 0;
      sum = 0;
      start = null;
      end = null;
    }

    AverageAccumulator(String city, LocalDate start, LocalDate end, long count, double sum) {
      this.city = city;
      this.count = count;
      this.sum = sum;
      this.start = start;
      this.end = end;
    }

    public AverageAccumulator add(String city, LocalDate eventDate, double value) {
      //make sure our dataflow is correct and we are summing data from the same city
      if (!this.city.equals("") && !this.city.equals(city)) {
        throw new IllegalArgumentException(city + " does not match " + this.city);
      }

      return new AverageAccumulator(
          city,
          earliest(this.start, eventDate),
          latest(this.end, eventDate),
          this.count + 1,
          this.sum + value);
    }

    public AverageAccumulator merge(AverageAccumulator that) {
      LocalDate mergedStart = earliest(this.start, that.start);
      LocalDate mergedEnd = latest(this.end, that.end);
      return new AverageAccumulator(
          this.city, mergedStart, mergedEnd, this.count + that.count, this.sum + that.sum);
    }

    private LocalDate earliest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isBefore(d2) ? d1 : d2;
      }
    }

    private LocalDate latest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isAfter(d2) ? d1 : d2;
      }
    }

    public double average() {
      return sum / count;
    }

    public String getCity() {
      return city;
    }

    public LocalDate getStart() {
      return start;
    }

    public LocalDate getEnd() {
      return end;
    }
  }
}
Problem:
The problem I am facing is that sometimes I do not get all the windows I am expecting. This does not always happen; sometimes consecutive runs produce different output, so I suspect there is a race condition somewhere.
For example, in one of the partition file outputs I sometimes get:
(2021-07-12,2021-07-14,Belgrade,56.666666666666664)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)
While sometimes I get the full set:
(2021-06-28,2021-06-30,Belgrade,48.666666666666664)
(2021-07-01,2021-07-07,Belgrade,41.142857142857146)
(2021-07-08,2021-07-14,Belgrade,52.857142857142854)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)
Is there anything evidently wrong in my dataflow pipeline? I can't figure out why this would happen, and it doesn't always happen with the same city either.
What could be happening?
UPDATE
So it seems that when I disabled watermarks the problem didn't happen anymore. I changed the WatermarkStrategy to the following:
final WatermarkStrategy<CityMetric> cityMetricWatermarkStrategy =
    WatermarkStrategy.<CityMetric>noWatermarks()
        .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());
And so far I have been getting consistent results. When I checked the documentation it says that:
static WatermarkStrategy noWatermarks()
Creates a watermark strategy that generates no watermarks at all. This may be useful in scenarios that do pure processing-time based stream processing.
But I am not doing processing-time based stream processing; I am doing event-time processing.
Why would forMonotonousTimestamps() have the strange behaviour I was seeing? Indeed my timestamps are monotonically increasing (the noWatermarks strategy wouldn't work if they weren't), but somehow changing this does not work well with my scenario.
Is there anything I am missing with the way things work in Flink?
ANSWER
Answered 2022-Feb-14 at 20:51
Flink doesn't support per-key watermarking. Each parallel task generates watermarks independently, based on observing all of the events flowing through that task.
So the reason this isn't working with the forMonotonousTimestamps watermark strategy is that the input is not actually in order by timestamp. It is temporally sorted within each city, but not globally. This is then going to result in some records being late, but unpredictably so, depending on exactly when watermarks are generated. These late events are being ignored by the windows that should contain them.
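To confirm that silently dropped late events are the cause, one could route late records to a side output and log them. A minimal diagnostic sketch, assuming the keyed stream from the question's pipeline (the pm25Metrics variable name here is illustrative):
final OutputTag<CityMetric> lateTag = new OutputTag<CityMetric>("late-metrics") {};

final SingleOutputStreamOperator<Tuple4<LocalDate, LocalDate, String, Double>> averages =
    pm25Metrics // hypothetical name for the filtered, timestamped CityMetric stream
        .keyBy(CityMetric::getCity)
        .window(TumblingEventTimeWindows.of(Time.days(7)))
        .sideOutputLateData(lateTag) // late events land here instead of being silently dropped
        .aggregate(new CityAverageAggregate());

// If records show up on this stream, the watermark strategy is discarding them.
averages.getSideOutput(lateTag).print("LATE");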
You can address this in a number of ways:
(1) Use a forBoundedOutOfOrderness watermark strategy with a duration sufficient to account for the actual out-of-order-ness in the dataset. Given that the data looks something like this:
03/10/2021,GR,Athens,pressure,60,1017.9,1040.6,1020.9,542.4
28/06/2021,US,Atlanta,co,24,1.4,7.3,2.2,19.05
that will require an out-of-orderness duration of approximately 100 days.
(2) Configure the windows to have sufficient allowed lateness. This will result in some of the windows being triggered multiple times: once when the watermark indicates they can close, and again each time a late event is added to the window.
(3) Use the noWatermarks strategy. This will lead to the job only producing results if and when it reaches the end of its input file(s). For a continuous streaming job this wouldn't be workable, but for finite (bounded) inputs it can work.
(4) Run the job in RuntimeExecutionMode.BATCH mode. Then the job will only produce results at the end, after having consumed all of its input. This runs the job with a more optimized runtime designed for batch workloads, but the outcome should be the same as with (3).
(5) Change the input so it isn't out-of-order.
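A hedged sketch of what options (1), (2), and (4) look like in code follows; the CityMetric type and its getTimestamp() accessor are taken from the snippet above, while the window size, lateness value, and class name are illustrative assumptions.

import java.time.Duration;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatenessOptions {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Option (4): the batch runtime emits each window's result exactly once,
    // after all of the (bounded) input has been consumed.
    env.setRuntimeMode(RuntimeExecutionMode.BATCH);

    // Option (1): accept events arriving up to ~100 days out of order.
    WatermarkStrategy<CityMetric> boundedStrategy =
        WatermarkStrategy.<CityMetric>forBoundedOutOfOrderness(Duration.ofDays(100))
            .withTimestampAssigner((cityMetric, ts) -> cityMetric.getTimestamp());

    // Option (2), sketched: keep window state around and re-fire on late events, e.g.
    //   .window(TumblingEventTimeWindows.of(Time.days(7)))
    //   .allowedLateness(Time.days(100))

    // ... attach the source, watermark strategy, and windowing, then call env.execute().
  }
}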
QUESTION
Do I need a JAR file to run a Flink application?
Asked 2022-Jan-20 at 21:29
I am working through the book Stream Processing with Apache Flink by Fabian Hueske and Vasiliki Kalavri.
The book includes an example Flink application, and I want to figure out how to run it. It is a Scala file located here in their GitHub repo.
Must I turn it into a JAR file before I run it? If so, how do I convert the file to a JAR?
ANSWER
Answered 2022-Jan-20 at 21:29
Except for SQL queries submitted with Flink's SQL client, a user needs to package a JAR file. Usually, a Flink program is packaged using a Maven or Gradle project; for example, the project generated by Flink's quickstart Maven archetype builds a runnable fat JAR with mvn clean package, which can then be submitted with flink run.
QUESTION
Calling Hibernate in Spring cloud Stream
Asked 2022-Jan-03 at 10:17
I'm new to Spring Cloud Stream.
Say I have a Spring Cloud Stream app that listens to some topic from Kafka using @StreamListener("input-channel").
I want to do some calculation and send the result to another topic, but in the middle of the processing I also need to call Hibernate (via Spring Data JPA) to persist some data to my MySQL database.
Is it valid to call Hibernate in the middle of stream processing? Is there another pattern for doing this?
ANSWER
Answered 2022-Jan-03 at 10:17
Yes - it's a database call, so why not? People do it all the time.
Also, @StreamListener has been deprecated for 3 years now and is already removed from the newer versions, so please transition to the functional programming model.
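For reference, a minimal sketch of that functional model follows; OrderEvent, OrderResult, OrderEntity, and OrderRepository are hypothetical stand-ins for the asker's domain types and Spring Data JPA repository.

import java.util.function.Function;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProcessingConfig {

  // Spring Cloud Stream binds this bean to Kafka via the
  // process-in-0 / process-out-0 destinations in application properties.
  @Bean
  public Function<OrderEvent, OrderResult> process(OrderRepository repository) {
    return event -> {
      // The JPA (Hibernate) call sits in the middle of the stream: persist, then continue.
      repository.save(new OrderEntity(event.getId(), event.getAmount()));
      return new OrderResult(event.getId(), event.getAmount() * 1.2);
    };
  }
}

The incoming message drives the function, the repository call persists to MySQL mid-stream, and the return value is published to the output topic.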
QUESTION
Filtering in Kafka and other streaming technologies
Asked 2021-Dec-27 at 07:46
I am currently doing research on which stream processing technology to use. So far I have looked at message queueing technologies and streaming frameworks. I am now leaning towards Apache Kafka or Google Pub/Sub.
The requirements I have:
- Deliver, read and process messages/events in real time.
- Persistence in the messages/events.
- Ability to filter messages/events in real time without having to read the entire topic. For example: if I have a topic called ‘details’, I want to be able to filter out the messages/events in that topic where an attribute of an event equals a certain value.
- Ability to see if the producer to a certain topic or queue is finished.
- Ability to delete messages/events in a topic based on an attribute within an event equaling a certain value.
- Ordering in messages/events.
My question is: what is the best framework/technology for these use cases? From what I have read so far, Kafka doesn't provide an out-of-the-box filtering approach for messages/events in topics, whereas Google Pub/Sub does.
Any suggestions and experience would be welcome.
ANSWER
Answered 2021-Dec-27 at 07:46
As per the requirements you mentioned, Kafka seems a nice fit: using Kafka Streams or KSQL you can perform filtering in real time. Here is an example: https://kafka-tutorials.confluent.io/filter-a-stream-of-events/confluent.html
What you need is more than just integration and data transfer; you need something similar to what is known as an ETL tool. You can find more about ETL and the ETL tools in GCP here: https://cloud.google.com/learn/what-is-etl
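To give a sense of scale for that filtering, here is a minimal Kafka Streams sketch; the topic names and the assumption that the value is a JSON string carrying a country attribute come from the question, not from the linked tutorial.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class FilterDetails {
  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();

    // Read 'details', keep only events whose attribute matches, write them to a new topic.
    KStream<String, String> details = builder.stream("details");
    details
        .filter((key, value) -> value.contains("\"country\":\"US\""))  // naive string match for illustration
        .to("details-us");

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "details-filter");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    new KafkaStreams(builder.build(), props).start();
  }
}

Downstream consumers then subscribe to the filtered topic instead of re-reading and discarding the whole 'details' stream.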
QUESTION
Pexpect Multi-Threading Idle State
Asked 2021-Dec-10 at 14:52
We have ~15,000 nodes to log into and pull data from via Pexpect. To speed this up, I am doing multiprocessing - splitting the load equally between 12 cores. That works great, but this is still over 1000 nodes per core - processed one at a time.
The CPU utilization of each core during this processing is roughly 2%. That sort of makes sense, as most of the time is spent waiting to see the Pexpect expect value as the node streams output. To take advantage of this and speed things up further, I want to implement multi-threading within the multiprocessing on each core.
To avoid any issues with shared variables, I put all the data needed to log into a node in a dictionary (one key per node), and then slice the dictionary, with each thread receiving a unique slice. After the threads are done, I combine the dictionary slices back together.
However, I am still seeing one thread completely finish before the work moves to the next.
I am wondering what constitutes an idle state such that a core can be moved to work on another thread. Does the fact that it is always looking for the Pexpect expect value mean it is never idle?
Also, I use the same target function for each thread, and I am not sure whether the target function being the same for each thread (with the same variables local to that function) is influencing this.
My multi-threading code is below, for reference.
Thanks for any insight!
import threading
from collections import ChainMap

import numpy as np
# ... plus pexpect and the other imports used inside get_output()

class ThreadClass(threading.Thread):
    def __init__(self, outputs_dict_split):
        super(ThreadClass, self).__init__()
        self.outputs_dict_split = outputs_dict_split

    def run(self):
        # Store the result back on self so it is visible after join();
        # Thread.run()'s return value is discarded by the threading module.
        self.outputs_dict_split = get_output(self.outputs_dict_split)

def get_output(outputs_dict):
    ### PEXPECT STUFF TO LOGIN AND RUN COMMANDS ####
    ### WRITE DEVICE'S OUTPUTS TO DEVICE'S OUTPUTS_DICT RESULTS SUB-KEY ###
    return outputs_dict

def backbone(outputs_dict):
    filterbykey = lambda keys: {x: outputs_dict[x] for x in keys}
    num_threads = 2
    device_split = np.array_split(list(outputs_dict.keys()), num_threads)

    outputs_dict_split_list = []
    split_list1 = list(device_split[0])
    split_list2 = list(device_split[1])
    outputs_dict_split1 = filterbykey(split_list1)
    outputs_dict_split2 = filterbykey(split_list2)
    t1 = ThreadClass(outputs_dict_split1)
    t2 = ThreadClass(outputs_dict_split2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    outputs_dict_split1 = t1.outputs_dict_split
    outputs_dict_split2 = t2.outputs_dict_split
    outputs_dict_split_list.append(outputs_dict_split1)
    outputs_dict_split_list.append(outputs_dict_split2)
    outputs_dict = ChainMap(*outputs_dict_split_list)

    ### Downstream Processing ###
ANSWER
Answered 2021-Dec-10 at 14:52
This actually worked. However, I had to scale the number of devices being processed in order to see substantial improvements in overall processing time.
QUESTION
Apache Flink - how to stop and resume stream processing on downstream failure
Asked 2021-Nov-22 at 04:53
I have a Flink application that consumes incoming messages on a Kafka topic with multiple partitions, does some processing, then sends them to a sink that sends them over HTTP to an external service. Sometimes the downstream service is down, and stream processing needs to stop until it is back in action.
There are two approaches I am considering.
- Throw an exception when the Http sink fails to send the output message. This will cause the task and job to restart according to the configured restart strategy. Eventually the downstream service will be back and the system will continue where it left off.
- Have the Sink sleep and retry on failure; it can do this continually until the downstream service is back.
From what I understand and from my PoC, with (1) I will lose exactly-once guarantees, since the sink itself is external state. As far as I can see, you cannot make a simple HTTP endpoint transactional, as it would need to be to implement TwoPhaseCommitSinkFunction.
With (2) this is less of an issue, since the pipeline will not proceed until the sink makes a successful write, and I can rely on back pressure throughout the system to pause the retrieval of messages from the Kafka source.
The main questions I have are:
- Is it a correct assumption that you can't make a TwoPhaseCommitSinkFunction for a simple HTTP endpoint?
- Which of the two strategies, or neither, makes the most sense?
- Am I missing simpler obvious solutions?
ANSWER
Answered 2021-Nov-22 at 04:53
I think you can try AsyncIO in Flink - https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/.
Try to make the HTTP endpoint send a response only once all work for the request has been done, e.g. once the server has processed the request and committed the result to the DB. Then use an async HTTP client in the AsyncIO operator. The AsyncIO operator will wait until the response is received by the operator. If any error happens, the Flink streaming pipeline will fail and restart based on the recovery strategy.
All requests sent to the HTTP endpoint that have not yet received a response remain in the internal buffer of the AsyncIO operator, and if the streaming pipeline fails, the requests pending in the buffer are saved in the checkpoint state. The operator will also trigger back pressure when its internal buffer is full.
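To make that concrete, here is a hedged sketch of such an AsyncIO stage using Java 11's HttpClient; the endpoint URL, the String element type, and the timeout/capacity values are illustrative assumptions, not the asker's actual setup.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Collections;
import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class HttpAsyncFunction extends RichAsyncFunction<String, String> {
  private transient HttpClient client;

  @Override
  public void open(org.apache.flink.configuration.Configuration parameters) {
    client = HttpClient.newHttpClient();
  }

  @Override
  public void asyncInvoke(String element, ResultFuture<String> resultFuture) {
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://downstream.example/ingest"))  // hypothetical endpoint
        .POST(HttpRequest.BodyPublishers.ofString(element))
        .build();
    client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
        .whenComplete((response, error) -> {
          if (error != null || response.statusCode() != 200) {
            // Failing the future fails the job, which restarts per the recovery strategy.
            resultFuture.completeExceptionally(
                error != null ? error : new RuntimeException("HTTP " + response.statusCode()));
          } else {
            resultFuture.complete(Collections.singleton(response.body()));
          }
        });
  }
}

// Wiring it in, with a timeout and a capacity that bounds in-flight requests:
// AsyncDataStream.unorderedWait(input, new HttpAsyncFunction(), 30, TimeUnit.SECONDS, 100);

The capacity argument is what produces the back pressure described above: once 100 requests are in flight without responses, the operator stops pulling from upstream.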
QUESTION
Hazelcast IMDG vs Hazelcast Jet config
Asked 2021-Oct-12 at 22:08
How do I read Hazelcast IMDG data in Hazelcast Jet?
In my case I require both Hazelcast IMDG (distributed cache), to store data for the future, and Jet, to perform batch and stream processing.
So I will be saving data using Hazelcast IMDG (MapStore) and filtering using Hazelcast Jet.
public class Test {

    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    JetInstance jet = Jet.newJetInstance();

    public static void main(String[] args) {
        Test t = new Test();
        t.loadIntoIMap();
        t.readFromIMap();
    }

    public void loadIntoIMap() {
        IMap<String, String> map = hz.getMap("my-distributed-map");
        // Standard Put and Get
        map.put("1", "John");
        map.put("2", "Mary");
        map.put("3", "Jane");
    }

    public void readFromIMap() {
        System.err.println("--manu---");
        jet.getMap("s").put("1", "2");
        System.err.println(jet.getMap("s").size());
        System.err.println(jet.getMap("my-distributed-map").size());
    }

}
Do we need separate configuration for both (Jet and IMDG), or can I share Hazelcast IMap data inside Jet with a single config?
I'm a little confused between Jet and Hazelcast IMDG.
ANSWER
Answered 2021-Oct-12 at 22:08
The answer differs depending on the version you want to use.
IMDG up to 4.2 and Jet 4.5
Hazelcast Jet is built on top of Hazelcast IMDG. When you start a Jet instance, there is automatically an IMDG instance running. There is a JetInstance#getHazelcastInstance method to retrieve the IMDG instance from the Jet instance, and JetConfig#setHazelcastConfig to configure IMDG-specific settings.
You can access the maps from your cluster in Jet using com.hazelcast.jet.pipeline.Sources#map(String).
You should not start both IMDG and Jet separately on the same machine. However, you can create two clusters, one IMDG and one Jet, and connect from Jet using com.hazelcast.jet.pipeline.Sources#remoteMap(String, ClientConfig) and its counterparts for other data structures.
If you are already using Hazelcast, it's likely this version.
Hazelcast 5.0
With the recent 5.0 release these two products were merged together. There is a single artefact to use - com.hazelcast:hazelcast. You just create a Hazelcast instance and, if enabled, you can get the Jet engine from it using HazelcastInstance#getJet.
5.0 is 100% compatible with IMDG 4.2 - just change the dependency - and mostly compatible with Jet 4.5, though some code changes are needed.
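Under 5.0, a minimal sketch of the single-cluster approach might look like the following; enabling the Jet engine in the Config is required, and the map name is carried over from the question.

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.jet.JetService;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;
import com.hazelcast.map.IMap;

public class SingleClusterExample {
  public static void main(String[] args) {
    Config config = new Config();
    config.getJetConfig().setEnabled(true);  // the Jet engine is off by default in 5.0
    HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

    // The same instance serves as the "IMDG" cache...
    IMap<String, String> map = hz.getMap("my-distributed-map");
    map.put("1", "John");

    // ...and as the Jet engine reading that map in a pipeline.
    JetService jet = hz.getJet();
    Pipeline pipeline = Pipeline.create();
    pipeline.readFrom(Sources.<String, String>map("my-distributed-map"))
        .filter(entry -> entry.getValue().startsWith("J"))
        .writeTo(Sinks.logger());
    jet.newJob(pipeline).join();
  }
}

Here one member both stores the IMap and runs the Jet job, so no second cluster or separate configuration is needed.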
QUESTION
What is benefit of using Kafka streams?
Asked 2021-Oct-07 at 15:28
I am trying to work out the benefit of using Kafka Streams in my business model. Customers publish an order and instantly get offers from sellers who are online and interested in that order.
In this case streams are a fit for joining available (online) sellers to the order stream and for filtering and sorting the offers (by price). As a result, the customer should get the best offers by price on request.
I have discovered only one benefit: fewer server calls (all calculations happen in the stream).
My question is: why do streams matter in this case, when I could implement these business steps using the standard approach with one monolithic application?
I know this question is opinion based, but after reading some books about stream processing it is still hard to change my mind on this approach.
ANSWER
Answered 2021-Oct-07 at 15:28
"only one benefit: fewer server calls"
Kafka Streams can still make "server calls", especially when using Interactive Queries with an RPC layer. Fetching data from a remote table, such as ksqlDB, is also a "server call".
This is not the only benefit. Have you tried to write a join between topics using the regular consumer API? Or a filter/map in fewer than two lines of code (outside the config setup)?
"could implement these business steps using the standard approach with one monolithic application?"
A Streams topology can still be embedded within a monolith, so I don't understand your point here. I assume you mean a fully synchronous application with a traditional database + API layer?
The books you say you've read should cover most benefits of stream processing, but you might want to check out "Kafka Streams in Action" to get the specific advantages of that library.
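To make the join point concrete, here is a hedged sketch of a stream-table join for the order/offer scenario; the topic names, String payloads, and join logic are hypothetical illustrations, not the asker's actual model.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class OrderOfferJoin {
  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();

    // Orders arrive as a stream; online sellers are kept as an ever-updating table.
    // Both are assumed to be keyed by the same attribute (e.g. product or region).
    KStream<String, String> orders = builder.stream("orders");
    KTable<String, String> onlineSellers = builder.table("online-sellers");

    // The join itself is two lines; the equivalent with the plain consumer API
    // would mean manual state management and re-keying.
    orders.join(onlineSellers, (order, seller) -> seller + " offers on " + order)
        .to("offers");

    // Wire up Properties and new KafkaStreams(builder.build(), props).start()
    // as in the filter example earlier on this page.
  }
}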
QUESTION
How to filter data in Kafka?
Asked 2021-Sep-29 at 09:50
My understanding is that I can filter data using a stream and route it to specific topics.
Problem: the producer sends data with a country field. Stream processing then filters the data and writes it to topics by country code.
As a result, those consumers who are subscribed to specific countries (codes) get the messages.
The problem is that this requires a lot of topics - one per country - and in the future I will need to do the same for other attributes.
How do I organize this in Kafka and filter the data?
ANSWER
Answered 2021-Sep-29 at 09:50
You have a few options here:
Kafka Streams: with Kafka Streams you can filter the data as per your need and write it to new topics (see the sketch after this list). Consumers can then consume messages from those new topics.
Filter the data on the consumer side: you consume the data and filter it against the required criteria on the consumer side.
Use separate partitions for separate country codes: you define the total number of partitions of the topic to match the number of country codes and make the country code the message key. Then point your consumers at the right partition to consume country-specific messages.
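A sketch of the first option follows, using a dynamic TopicNameExtractor so that each event is routed to a per-country topic without hard-coding a topic list; the 'details' topic name and the extractCountry() helper are assumptions for illustration.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class CountryRouter {
  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();

    KStream<String, String> details = builder.stream("details");

    // Route each record to a topic derived from its country code,
    // e.g. details-US, details-DE; the extractor lambda runs per record.
    details.to((key, value, recordContext) -> "details-" + extractCountry(value));

    // Wire up Properties and KafkaStreams as in the earlier filter example.
  }

  private static String extractCountry(String value) {
    // Placeholder: pull the country code out of the serialized event.
    return value.substring(0, 2);
  }
}

Note that Kafka Streams does not create the target topics for you, so the per-country topics must already exist.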
Community Discussions contain sources that include the Stack Exchange Network.