Dataloader
What is it
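Dataloader sits on top of your data access layer. It de-dupes outgoing requests by batching the keys requested within a single tick of the event loop into one call (a form of debouncing), and it caches the response for each key using memoization.
A new dataloader is created for every incoming request, so cached data is not shared across requests.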
Purpose: Solves the N+1 problem in GraphQL.
- Because data is fetched at the field level, we run the risk of overfetching and N+1 queries.
    /* in middleware file */
    const DataLoader = require('dataloader');

    const userLoader = new DataLoader(keys => (
      myBatchGetUsers(keys)
    ));
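The batch function handed to the loader has a contract worth spelling out: it receives an array of keys and must resolve to an array of the same length, with each value at the same index as its key. Below is a minimal sketch of what myBatchGetUsers could look like. The in-memory usersTable is a made-up stand-in for a real data source, and the sample rows are assumptions for illustration only.

```javascript
// Hypothetical in-memory table; a real implementation would run one
// query such as `SELECT * FROM users WHERE id IN (...)`.
const usersTable = [
  { id: 3, name: 'Cam' },
  { id: 1, name: 'Sam' },
];

// Batch function: one lookup for all keys, results returned in key order.
async function myBatchGetUsers(keys) {
  const rows = usersTable.filter((row) => keys.includes(row.id));
  const byId = new Map(rows.map((row) => [row.id, row]));
  // Rows can come back in any order, so re-align them to `keys`.
  // A missing key gets an Error so that only that key's load() rejects.
  return keys.map((id) => byId.get(id) ?? new Error(`User ${id} not found`));
}
```

Returning an Error instance at a given index, rather than throwing, keeps one missing row from failing the whole batch.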
Because a dataloader is created, and batches, per data entity (e.g. User table, Books table), each entity can come from a different source.
- e.g. User from a SQL DB, Books from a NoSQL DB
Architecture Layer
Dataloader(s) are defined in the resolvers.
If you have a frequently run query, you can also define them in the middleware layer.
The resolvers then call the loaders that the middleware placed on the context.
Middleware in Express example:
    /* module */
    function dataLoadersMiddleware(ctx, next) {
      // dataloader definition as a middleware function
      // create loaders, put into context
    }

    /* app.ts */
    app
      // .use(otherMiddleware)
      .use(dataLoadersMiddleware)
      // .use(someOtherMiddlewares)

    const server = new ApolloServer({
      typeDefs,
      resolvers,
      context: () => {
        return {
          someFieldLoader: new DataLoader(async keys => {
            // placeholder: a real batch function must resolve to an array
            // with one value per key, in the same order as keys
            const someField = await fetch('someSource')
            return someField;
          })
        }
      }
    })

    /* resolver.js */
    async function f(parent, args, ctx) {
      const f = await ctx.dataLoaders.fLoader.load(args.fId);
      return {
        ...f,
        gs: await ctx.dataLoaders.gLoader.loadMany(f.gs),
      }
    }

    async function g(self, args, ctx) {
      return {
        ...self,
        hs: await ctx.dataLoaders.hLoader.loadMany(self.hs),
      }
    }

    // etc...
Illustration
You can have a situation where a nested field could cause duplicate parent-level field requests.
    query {
      name
      attendee {
        events {
          title
          attendees {
            name
          }
        }
      }
    }
If attendees can attend more than one event, we'd encounter overfetching from making duplicate requests for the same attendees attending multiple events.
- e.g. attendee Sam attends two events, so GraphQL would query for Sam's name twice or more, once for each event Sam attends.
We can fix this by making the resolver lighter at the parent level and shifting the resolver function to the fields.
- e.g. a resolver for each of the events, title, and attendees fields instead of one resolver at events.
  - this makes each child field responsible for fetching its own data.
Dataloader leverages this structure breakdown to batch and de-dupe requests.
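To make the batching and de-duping concrete, here is a simplified stand-in for a loader. It is a sketch for illustration, not the dataloader package's implementation: TinyLoader, batchGetAttendees, and the sample data are all made up. It queues keys requested in the same tick, calls the batch function once, and memoizes one promise per key.

```javascript
// Simplified stand-in for illustration only (not the real `dataloader`).
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.cache = new Map(); // key -> Promise (memoization, de-dupes keys)
    this.queue = [];        // keys waiting for the next batch
  }

  load(key) {
    if (this.cache.has(key)) return this.cache.get(key);
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // First key of a batch: dispatch after the current tick's work.
      if (this.queue.length === 1) process.nextTick(() => this.dispatch());
    });
    this.cache.set(key, promise);
    return promise;
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// Hypothetical data: Sam (id 1) attends both events.
const attendeesById = { 1: { id: 1, name: 'Sam' }, 2: { id: 2, name: 'Alex' } };
const events = [
  { title: 'Event A', attendeeIds: [1, 2] },
  { title: 'Event B', attendeeIds: [1] },
];

const batches = []; // records the keys of every call to the data source
async function batchGetAttendees(ids) {
  batches.push(ids);
  return ids.map((id) => attendeesById[id] ?? null);
}

// Each attendees field asks for its own data, as field-level resolvers would.
async function demo() {
  const loader = new TinyLoader(batchGetAttendees);
  const resolved = await Promise.all(
    events.map((event) => Promise.all(event.attendeeIds.map((id) => loader.load(id))))
  );
  return { resolved, batches };
}
```

In this sketch, demo() reaches the data source once with keys [1, 2], even though Sam is requested by both events.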
Field name differences between the service layer response and the GraphQL response are mapped in the dataloader.
- Because the dataloader only cares about the actual queries generated from keys.
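One way to do that mapping is inside the batch function, so resolvers only ever see GraphQL field names. The sketch below assumes a service layer that returns snake_case fields. The row shape, the field names, and the helper toGraphqlUser are all hypothetical.

```javascript
// Hypothetical service-layer rows, using snake_case field names.
const serviceRows = [
  { user_id: 1, full_name: 'Sam', email_address: 'sam@example.com' },
  { user_id: 2, full_name: 'Alex', email_address: 'alex@example.com' },
];

// Rename service-layer fields to the names the GraphQL schema exposes.
function toGraphqlUser(row) {
  return { id: row.user_id, name: row.full_name, email: row.email_address };
}

// Batch function: look rows up by key, map the field names, keep key order.
async function batchGetUsers(keys) {
  const byId = new Map(
    serviceRows.map((row) => [row.user_id, toGraphqlUser(row)])
  );
  return keys.map((id) => byId.get(id) ?? null);
}
```

The loader itself stays keyed by id, so the renaming has no effect on batching or caching.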