This repository has been archived by the owner on Dec 27, 2019. It is now read-only.

how do i get this to work with pg@7? #28

Open
jonathanong opened this issue Jul 24, 2017 · 12 comments

@jonathanong

seems like pg@7 always returns a promise now, and the promise doesn't seem to resolve for me

@brianc
Owner

brianc commented Jul 24, 2017

Hey @jonathanong - could you include a sample of code that reproduces the issue? I'd love to take a look at it for you.

@jonathanong
Author

jonathanong commented Jul 26, 2017

Using a new file, after npm i --save pg pg-query-stream:

const QueryStream = require('pg-query-stream')
const { Pool } = require('pg')

const client = new Pool()

const result = client.query(new QueryStream('SELECT * FROM users'))

console.log(typeof result.then === 'function') // returns `true`

Promise.resolve(result).then((stream) => {
  stream.on('data', x => console.log(x))
  .on('error', e => console.error(e.stack))
  .on('end', () => {
    console.log('finished')
    process.exit(1)
  })
}).catch(e => {
  console.error(e.stack)
  process.exit(1)
})

it never exits or logs anything except true.

@qas

qas commented Aug 3, 2017

Would be nice to see this resolved soon! 😄

@Suor

Suor commented Aug 4, 2017

@jonathanong it shouldn't return a Promise in the first place; it should return the stream.

@Suor

Suor commented Aug 4, 2017

Looks like this is Pool.query()'s fault: it doesn't handle submittables.
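For context, here is a minimal sketch (mine, not Suor's; it uses a plain Client and the users query from the snippet above) of the contract Pool.query was missing: when client.query is handed a submittable, the client dispatches it on its own connection and returns that same object rather than a promise:

const { Client } = require('pg')
const QueryStream = require('pg-query-stream')

async function demo() {
  const client = new Client()
  await client.connect()
  // A plain Client handles submittables: the return value is the QueryStream itself.
  const stream = client.query(new QueryStream('SELECT * FROM users'))
  console.log(stream instanceof QueryStream) // true
  stream.on('data', row => console.log(row))
  stream.on('end', () => client.end())
}

demo().catch(e => console.error(e.stack))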

@brianc
Owner

brianc commented Aug 4, 2017

Yah sounds like an oversight on my part w/ pg-pool. Sorry about that!

@brianc brianc self-assigned this Aug 6, 2017
@brianc
Owner

brianc commented Aug 7, 2017

For now, use pool.connect to acquire a client, then pass the stream to the client.query method.

const client = await pool.connect()
const stream = client.query(new QueryStream('select *'))
stream.on('end', client.release)
stream.pipe(res) // ..etc

pool.query has slightly different semantics in that it has to first establish a connection to the backend which could throw an error before your submittable is even dispatched. I need to do some re-working on the internals of pg-query-stream and pg-cursor to support pool.query and I don't have the time right away.
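A slightly fuller version of the workaround above, with error handling added (the error handling, and the assumption that res is a writable HTTP response, are my additions, not part of the comment above):

const { Pool } = require('pg')
const QueryStream = require('pg-query-stream')

const pool = new Pool()

async function streamUsers(res) {
  // pool.connect() is where a connection error would surface,
  // before the submittable is ever dispatched
  const client = await pool.connect()
  const stream = client.query(new QueryStream('SELECT * FROM users'))
  stream.on('end', () => client.release())
  stream.on('error', err => {
    client.release(err) // passing an error keeps the broken client from being reused
    res.destroy(err)
  })
  stream.pipe(res)
}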

If you or your company benefit from node-postgres and have the means, please consider supporting my work on Patreon.

@brianc brianc closed this as completed Aug 7, 2017
@brianc brianc reopened this Aug 7, 2017
@brianc
Owner

brianc commented Aug 7, 2017

whoops didn't mean to close this

@brandonros

brandonros commented Jan 29, 2018

In case anybody wants a full snippet:

// requires needed for this snippet (not shown in the original comment):
var pg = require('pg');
var express = require('express');
var QueryStream = require('pg-query-stream');
var JSONStream = require('JSONStream');
// `now` (a high-resolution timer) and `sql` (the query text) are assumed to be defined elsewhere

(async function() {
  var pool = new pg.Pool({
    user: 'brandon',
    host: '192.168.0.2',
    database: 'db',
    password: '',
    port: 5432,
  });

  var client = await pool.connect();

  var app = express();

  app.get('/', function(req, res) {
    console.log(new Date(), 'Starting...');

    var start = now();

    var stream = client.query(new QueryStream(sql));

    stream.on('end', function() {
      var end = now();

      console.log(new Date(), 'Finished...', end - start);

      client.release();
    });

    stream.pipe(JSONStream.stringify()).pipe(res);
  });

  app.listen(5000, function() {
    console.log('Listening...');
  });
})();

I'm also a little confused about the proposed performance benefits of this package.

I ran the above code against this code:

// (same requires as the snippet above)
(async function() {
  var pool = new pg.Pool({
    user: 'brandon',
    host: '192.168.0.2',
    database: 'db',
    password: '',
    port: 5432,
  });

  var client = await pool.connect();

  var app = express();

  app.get('/', function(req, res) {
    console.log(new Date(), 'Starting...');

    var start = now();
    
    client.query(sql, function(err, sqlRes) {
      res.send(JSON.stringify(sqlRes.rows));

      var end = now();

      console.log(new Date(), 'Finished...', end - start);
    });
  });

  app.listen(5000, function() {
    console.log('Listening...');
  });
})();

and I did not see any performance benefits.

Correct me if I am mistaken, but I thought the idea behind this package (and streams in general) was that the client could receive the first byte of the response much more quickly.

In the non-stream version, the code has to fetch all resulting rows for the query, stringify them, and then send them.

In the streaming version, the code should stream the rows to the client, stringifying them on the fly.

brandon@foo:~$ time curl http://localhost:5000/ -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2397k    0 2397k    0     0  1609k      0 --:--:--  0:00:01 --:--:-- 1610k

real	0m1.501s
user	0m0.016s
sys	0m0.008s
brandon@foo:~$ time curl http://localhost:5000/ -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2390k  100 2390k    0     0  1682k      0  0:00:01  0:00:01 --:--:-- 1683k

real	0m1.432s
user	0m0.008s
sys	0m0.004s

The differences I see are negligible, and if anything, the streaming version is slightly slower (which would make sense, given the output above).

I thought the semi-massive 2.5 MB query response I was getting would be perfect for this streaming package... but maybe 3.7k rows isn't enough?

@ChrisKozak

@brandonros processing a stream can give you the performance benefits you talk about. For example, when analyzing the Twitter firehose, you'd want to process tweets as they come in. But I think the primary purpose of streaming is to give your server a constant memory footprint regardless of the amount of data being processed.

If you load all of the results before returning, your memory consumption grows linearly with the result size. For a large enough result set you will exceed the memory available to your application and start seeing problems: app crashes, disk thrashing, etc.

If instead you stream the results through your server in constant-sized batches, your memory consumption remains constant for any size of result set.
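For illustration, a minimal sketch of that constant-memory behaviour (my own example, not from this thread; the table name is made up) that pipes a QueryStream through an object-mode Transform so only a small buffer of rows exists at any time:

const { Transform } = require('stream')
const { Pool } = require('pg')
const QueryStream = require('pg-query-stream')

async function countBigTable() {
  const pool = new Pool()
  const client = await pool.connect()
  const stream = client.query(new QueryStream('SELECT * FROM big_table'))

  let count = 0
  const counter = new Transform({
    objectMode: true,
    transform(row, _encoding, callback) {
      count += 1 // each row is handled and then dropped; nothing accumulates in memory
      callback()
    }
  })

  stream.on('error', err => { client.release(err); console.error(err.stack) })
  stream.on('end', () => client.release())
  stream.pipe(counter).on('finish', () => console.log('rows seen:', count))
}

countBigTable().catch(e => console.error(e.stack))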

Make sense?

@brandonros

brandonros commented Mar 26, 2018 via email

@nguyenhmp

@brandonros in terms of performance benefits, there is always a trade-off. In this case, if you had 1 million rows, you would otherwise have to load all 1 million rows into memory at once. With a stream, you could load only 10,000 rows at a time. The trade-off with this approach, however, is that you have to keep the stream (and its underlying connection) open while the rows are consumed.
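For reference, pg-query-stream takes an options argument whose batchSize controls how many rows are fetched from the server-side cursor per round trip; the query below is the generate_series example from its README, and the value 10000 is only illustrative (double-check the option name against the version you're using):

const QueryStream = require('pg-query-stream')

// Roughly the "10,000 rows at a time" idea from the comment above: the stream
// keeps at most one batch (plus its internal buffer) in memory.
const query = new QueryStream(
  'SELECT * FROM generate_series(0, $1) num',
  [1000000],
  { batchSize: 10000 }
)
// pass `query` to client.query(...) exactly as in the earlier snippets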
