[dbsql] InsertMany Row Batching #212
Conversation
```diff
  // Use a single multi-row insert
- sequences := make([]int64, len(instances))
+ sequences := allSequences[offset:end]
  err := c.DB.InsertTxRows(ctx, c.Table, tx, insert, func() {
```
Noting - if trace logs are enabled, this log line is ridiculously massive when you're inserting in bulk.
It's not even useful/readable for mortals or AI:
```go
const maxTraceArgs = 100

func limitTraceArgs(args []interface{}) string {
	if len(args) <= maxTraceArgs {
		return fmt.Sprintf("%+v", args)
	}
	return fmt.Sprintf("%+v ...and %d more", args[:maxTraceArgs], len(args)-maxTraceArgs)
}

func limitTraceSeqs(seqs []int64) string {
	if len(seqs) <= maxTraceArgs {
		return fmt.Sprintf("%v", seqs)
	}
	return fmt.Sprintf("%v ...and %d more", seqs[:maxTraceArgs], len(seqs)-maxTraceArgs)
}

func (s *Database) InsertTxRows(ctx context.Context, table string, tx *TXWrapper, q sq.InsertBuilder, postCommit func(), sequences []int64, requestConflictEmptyResult bool) error {
	l := log.L(ctx)
	// ...
	l.Tracef(`SQL-> insert query: %s (args: %s)`, sqlQuery, limitTraceArgs(args))
	// ...
}
```

Could this be an approach we should take? Just what's the limit?
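For illustration, the trimming behaves like this. This is a standalone sketch that duplicates the suggested `limitTraceArgs` helper so it can run on its own; the 250-element slice and the loop populating it are just example data, not from the PR:

```go
package main

import "fmt"

const maxTraceArgs = 100

// limitTraceArgs formats args in full when they fit, and otherwise
// truncates to the first maxTraceArgs entries plus an omitted count
// (same helper as suggested above).
func limitTraceArgs(args []interface{}) string {
	if len(args) <= maxTraceArgs {
		return fmt.Sprintf("%+v", args)
	}
	return fmt.Sprintf("%+v ...and %d more", args[:maxTraceArgs], len(args)-maxTraceArgs)
}

func main() {
	// A small slice is printed in full.
	fmt.Println(limitTraceArgs([]interface{}{1, 2, 3})) // prints "[1 2 3]"

	// A 250-element slice is trimmed to 100 entries plus a count of the rest.
	args := make([]interface{}, 250)
	for i := range args {
		args[i] = i
	}
	s := limitTraceArgs(args)
	fmt.Println(s[len(s)-15:]) // prints "...and 150 more"
}
```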
We'll tackle this another time
peterbroadhurst
left a comment
Thanks @onelapahead - I approve of the latest commit to remove the config setting.
I think it's unclear that a config setting of that shape makes sense, given that the max inserts is a fundamental characteristic of the database we're talking about (rather than a tuning option).
Provider implementations for PSQL and others (which FF common doesn't really come with right now) can have their own config settings.
This also has the benefit of not affecting the builds of consumers, including FF core, FFTM, etc.
When doing multi-row inserts, your parameter count in SQL is equal to rows * columns. With bulk inserts of 1000s of rows with 10+ columns, it's easy to end up exceeding the PostgreSQL limit on parameters, which is uint16's max - 65535.

So, drawing inspiration from LFDT-Paladin/paladin#644, we also add "batching" of multi-row inserts based on a `MaxPlaceholders` feature, which will chunk the rows across multiple `INSERT`s to ensure we're underneath the limit. Not as efficient as PostgreSQL's `UNNEST` from a performance perspective, but it still helps ff-common users avoid hitting DB limits when inserting in bulk.
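The chunking arithmetic can be sketched as follows. This is a minimal standalone illustration of the rows-per-batch calculation, not the actual ff-common implementation; `batchRows` and its `[offset, end)` pair output are hypothetical names for this example:

```go
package main

import "fmt"

// maxPlaceholders mirrors PostgreSQL's bind-parameter limit (uint16 max).
const maxPlaceholders = 65535

// batchRows splits rowCount rows into [offset, end) batches so that each
// INSERT uses at most maxPlaceholders parameters (rows * columns).
func batchRows(rowCount, columns int) [][2]int {
	rowsPerBatch := maxPlaceholders / columns
	var batches [][2]int
	for offset := 0; offset < rowCount; offset += rowsPerBatch {
		end := offset + rowsPerBatch
		if end > rowCount {
			end = rowCount
		}
		batches = append(batches, [2]int{offset, end})
	}
	return batches
}

func main() {
	// 10,000 rows of 10 columns would need 100,000 placeholders in a
	// single INSERT, so the rows are split across two statements.
	for _, b := range batchRows(10000, 10) {
		fmt.Printf("INSERT rows [%d:%d) -> %d placeholders\n", b[0], b[1], (b[1]-b[0])*10)
	}
}
```

Each emitted batch stays under the 65535-parameter ceiling, which is the invariant the `MaxPlaceholders` feature enforces.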