Get Data From SharePoint Lists … But Fast

Note: A better approach is described in a more recent post. The new version allows for Scheduled Refresh in the Power BI Service, and avoids the issue described in the comments below with duplicate items. See new post here.

SharePoint lists remain a convenient way to store data. However, get data from SharePoint lists with lots of items can be frustratingly slow. The updated SharePoint list connector improved things, but not completely. The video below walks through a way to use the Web connector with the SharePoint REST API to get data much faster (hours -> minutes, minutes -> seconds).

Below are the function and query versions of the M code described in the video to get your SharePoint list data much faster. Enjoy!

Function Version

(tenant_name, site_name, list_name) =>
   let
     tenantname = tenant_name,
     sitename = site_name,  // if a subsite use "Site/SubSite"                                                                                                      
     listname = list_name,
     baseurl = "https://"
       & tenantname
       & "/sites/"
       & sitename
       & "/_api/web/lists/GetByTitle('"
       & listname
       & "')/",
     itemcount = Json.Document(
       Web.Contents(baseurl & "ItemCount", [Headers = [Accept = "application/json"]])
     )[value],
     skiplist = List.Numbers(0, Number.RoundUp(itemcount / 5000), 5000),
     #"Converted to Table" = Table.FromList(
       skiplist,
       Splitter.SplitByNothing(),
       null,
       null,
       ExtraValues.Error
     ),
     #"Renamed Columns" = Table.RenameColumns(#"Converted to Table", {{"Column1", "Skip"}}),
     #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns", {{"Skip", type text}}),
     fieldselect = "&$top=5000",  // all fields with no expansion                                                                                             
     //fieldselect = "&$top=5000&$select = Id,Title,Person,Date", // list desired fields (no expansion)                                                                                                  
     //fieldselect = "&$top=5000&$select = Id,Title,Person/LastName,Person/FirstName,Date&$expand = Person", //expand list fields                                                                                                                            
     #"Added Custom" = Table.AddColumn(
       #"Changed Type",
       "Items",
       each Json.Document(
         Web.Contents(
           baseurl & "/items?$skipToken=Paged=TRUE%26p_ID=" & [Skip] & fieldselect,
           [Headers = [Accept = "application/json"]]
         )
       )
     ),
     #"Expanded Items" = Table.ExpandRecordColumn(#"Added Custom", "Items", {"value"}, {"value"}),
     #"Expanded value" = Table.ExpandListColumn(#"Expanded Items", "value")
   in
     #"Expanded value"


Query Version

let
   sitename = "",  // if a subsite use "Site/SubSite"                                                                                                      
   listname = "BigList",
   baseurl = "https:///sites/"
     & sitename
     & "/_api/web/lists/GetByTitle('"
     & listname
     & "')/",
   itemcount = Json.Document(
     Web.Contents(baseurl & "ItemCount", [Headers = [Accept = "application/json"]])
   )[value],
   skiplist = List.Numbers(0, Number.RoundUp(itemcount / 5000), 5000),
   #"Converted to Table" = Table.FromList(
     skiplist,
     Splitter.SplitByNothing(),
     null,
     null,
     ExtraValues.Error
   ),
   #"Renamed Columns" = Table.RenameColumns(#"Converted to Table", {{"Column1", "Skip"}}),
   #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns", {{"Skip", type text}}),
   fieldselect = "&$top=5000",  // all fields with no expansion                                                                                             
   //fieldselect = "&$top=5000&$select = Id,Title,Person,Date", // list desired fields (no expansion)                                                                                                  
   //fieldselect = "&$top=5000&$select=Id,Title,Choice,LookupColumn/Title,LookupColumn/Project,LookupColumn/ProjectStatus,Date,Person/LastName,Person/FirstName,Person/EMail&$expand=LookupColumn,Person",                                                                                                                                                                                                       
   Custom1 = Table.AddColumn(
     #"Changed Type",
     "Items",
     each Json.Document(
       Web.Contents(
         baseurl & "/items?$skipToken=Paged=TRUE%26p_ID=" & [Skip] & fieldselect,
         [Headers = [Accept = "application/json"]]
       )
     )
   ),
   #"Expanded Items" = Table.ExpandRecordColumn(Custom1, "Items", {"value"}, {"value"}),
   #"Expanded value" = Table.ExpandListColumn(#"Expanded Items", "value")
 in
   #"Expanded value"

You May Also Like

About the Author: Pat Mahoney

Pat Mahoney is a chemist in Indiana who enjoys teaching others to use Power BI and other tools, and providing pro bono data science help/instruction to nonprofits, schools, etc.

17 Comments

  1. Hi Pat! I’m using the code above to successfully gather data from a number of SharePoint lists our company keeps. However, when I publish the report to PowerBI Service I am unable to schedule a refresh because the queries contain “dynamic data sources.” Have you encountered this before and do you know if it can be fixed? My suspicion is that the api calls are dynamic in the sense that they rely on the ItemCount to create a skip list that will be iterated through. Thank you!

    1. Hello, I have the same problem. The solution is fantastic in terms of the speed of loading, but it does not let me schedule the update in the Power BI service, is there a solution? Thank you!

  2. Pat, thank you so much for this! Worked like a charm for a dashboard we use. Instead of 1.5-2 hours, it now finishes a refresh within a minute.

  3. Hi Pat,
    I noticed that the query didn't retrieve all items in some cases. Changed itemcount to the following:
    itemcount = Json.Document(Web.Contents( baseurl & "items?$select=ID&$orderby=ID%20desc&$top=1", [Headers = [Accept = "application/json"]]))[value]{0}[ID],

    This helps when the you deleted items from the list, and thus the IDs are getting bigger than the actual item count.

    1. Rene you are a genius! Thank you for posting that fix. I came across the anomaly today and checked back here in case someone had come across it and Voila!!!

  4. Except don't do this mod on a site's hidden User Information List – it gives an item count of millions when the actual count is just hundreds. For my non-special lists, I am using Rene's mod but for the special site Users Information list I'm using Pat's original.

  5. Hi Pat!
    Thank you for a terrific post. I have just sent the last 20 minutes making dinner plans while importing a mere 5400 lines from a list. Do you have any tips on how to tweak the query to suit a local SharePoint site rather than SharePoint online?

    My original query looks similar but but the list URL is a combination of site and object ID
    Source= SharePoint.Tables("https://xxx.yyy.zz/site/", [ApiVersion = 15]),
    #"11bb732d-164e-44ff-bca1-f4c523981efa" = Source{[Id="11bb732d-164e-44ff-bca1-f4c523981efa"]}[Items],

    Best regards,
    Fredrik

  6. Pat,

    This code works perfectly fine when using PBI Desktop. However, when I publish it and refresh data, it does not occur since the data is recognized as dynamic. Is it possible to be solved?

  7. Unfortunately, using either ItemCount or "Max" of desc ID divided by 5000 do not generate correct results. In my tests, i got duplicate records when items were deleted from the list.

    p_ID refers to the MAX Id of the last paged set returned. So if in your first 5000 records, the last ID (max id) is 5900 (ie 900 records were deleted so you have 5000 records numbered between 1 and 5900), then the p_ID should be 5900, not 5000.

    This means that on your second call, setting p_ID to 5000 instead of 5900, will result in duplicate data being returned.

    if p_ID was a pointer to the 5000 position record, this would work; but since p_ID is the SharePoint ID of the item, it does not; at least in my tests using SharePoint Online as of July 14, 2022.

    Also a good read is this blog about skipToken:
    joemcshea.intellipointsolutions.com / pagination-in-sharepoint-rest-requests-using-top-and-skiptoken

    If anyone find a graceful solution to this i'd love to hear. I ended up hard-coding 5 batch calls of 5000 using greater than/less than then appending the tables together…but in my case, if the IDs grow beyond 25,000 i'll have to add an additional call

  8. Hi Pat, this is AMAZING! thanks for sharing. I did a copmarison for a SharePoint list with 59K items with one lookup, one Person and One Choice and here are the result of refresh time in Power BI Desktop:
    – REST API: 00:00:18.1
    – V2: 00:00.59.2
    – V1: 04:45:09.2 "Failed to save modifications to the server. Error returned: 'OLE
    DB or ODBC error: [DataSource.Error] SharePoint: Request failed: The remote
    server returned an error: (429). (429 TOO MANY REQUESTS).
    '."

    The only question is I tried to add the Created By into REST API by adding one of the following but it didn't worked.
    Author, Author/Title, Author/title, Author/Name, Created By, Created By/title

    How can I add the Created By and Modified By to the records?

  9. Is there anyway to be able to use Scheduled Refresh for the REST API?
    After publishing the datase to the Service I am getting the following error for refresh and scheduled refresh:
    This dataset includes a dynamic data source. Since dynamic data sources aren't refreshed in the Power BI service, this dataset won't be refreshed.

  10. This is working fine… ????
    When you upload this in to the Power Bi Service it won't refresh because of the Dynamic Source issue.

    To overcome that I used the Chris Webb's RelativePath method and now its working like a charm. ?

    When loading this custom1 table try this

    sitename = "https://contoso.sharepoint.com/sites/mysite",

    Custom1 = Table.AddColumn(#"Changed Type", "Items", each Json.Document(Web.Contents(
    sitename,
    [
    RelativePath = "/_api/web/lists/GetByTitle('" & listname & "')/" & "/items?$skiptoken=Paged=TRUE%26p_ID=" &[Skip] & fieldselect, Headers=[Accept="application/json"]]))),

  11. If some of the items are deleted within the SP list, this count method not working properly. I did a little change to this by doing the folling.
    ??I took the first ID and the last ID numbers and populate the skiplist table using those numbers, in 5000 chunks

    ❤❤❤❤❤❤❤❤❤❤❤

    FirstId = Json.Document(Web.Contents(baseurl&"items?$top=1&$orderby=ID asc", [Headers=[Accept="application/json"]]))[value]{0}[ID],
    lastId = Json.Document(Web.Contents(baseurl&"items?$top=1&$orderby=ID desc", [Headers=[Accept="application/json"]]))[value]{0}[ID],

    skiplist = List.Numbers(FirstId-1, Number.RoundUp((lastId-FirstId)/5000), 5000),

    //Then using the relative path option custom table populated

    Custom1 = Table.AddColumn(#"Changed Type", "Items", each Json.Document(Web.Contents(
    sitename,
    [
    RelativePath = "/_api/web/lists/GetByTitle('" & listname & "')/" & "/items?$skiptoken=Paged=TRUE%26p_ID=" &[Skip] & fieldselect, Headers=[Accept="application/json"]]))),
    ❤❤❤❤❤❤❤❤❤

    Then just after expand remember to remove the duplicates of the ID, because when some records are deleted on the SharePoint List the same item will load manytimes with this method

    1. Thanks for the comment. FYI that I updated this post to include a link to a new post with some updated code that addresses the duplicate items and scheduled refresh issue.

Leave a Reply

Your email address will not be published. Required fields are marked *