Updated – Get SharePoint List Data … Fast

While storing your data in a database is recommended, many citizen developers do not have the access or skills to create databases. As a result, many valuable business solutions are based on top of SharePoint Lists, and all that data needs to be analyzed too. A previous post shared some M code to pull data from SharePoint lists faster than the native connector (especially V1). Please read that article to get all the context, as this post is only to explain and address two gaps in the previous code (highlighted in comments, and my own experience with it):

  1. The query does not allow Scheduled Refresh in the Power BI Service
  2. Duplicate items are returned for lists with deleted items

These two issues are addressed in the M code below. Actually, two versions are provided. Both are provided in case one is more easily adaptable to your use case. Also, while the first may be slightly faster, the List.Generate approach is more elegant (so I wanted to share that one too). Like before there are commented lines of code to demonstrate the needed syntax for different scenarios (e.g., when you need to expand data from Person and/or Lookup columns).

The first issue was straight forward to address, as it is a problem many users pulling web data face. Scheduled Refresh does not allow for dynamic sources, so the updated code below uses the RelativePath option in Web.Contents. This keeps the base url static for authentication, while the RelativePath updates dynamically. Note that when first setting up Scheduled Refresh, it may be necessary to check the "Skip Test Connection" box to get it to work.

To address the duplicate items issue, a bigger change was needed. The previous approach checks for the number of list items (or, with code provided in a helpful comment, the max ID) and generates a list of numbers incremented by 5000 (the max # of items that can be returned in a single call) up to that number. Each number is then used in the $skip parameter, along with a $top value of 5000. Say a list has 10,000 items, but the first two items were previously deleted. This would mean that the first remaining item has an ID value of 3 and the last item has an ID of 10,002. With a $skip value of 0 and $top of 5000, items 3 through 5002 would be returned. In the second API call ($skip = 5000), two of the same items would be returned (5001 and 5002). Instead of using $top and $skip, the new version below uses $filter on the ID column. This way guarantees that only the items between the specified ID values are returned.

let
  siteurl = "https://<your tenant url>/sites/<your site name>",  // use ... /sites/<your site name>/<your subsite name> if applicable                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
  listname = "<your list name>",
  itemcount = Json.Document(Web.Contents(siteurl, [RelativePath = "_api/web/lists/GetByTitle('" & listname & "')/items?$select=ID&$orderby=ID%20desc&$top=1", Headers = [Accept = "application/json"]]))[value]{0}[ID],
  StartIDs = List.Numbers(0, Number.RoundUp(itemcount / 5000), 5000),
  ConvertToTable = Table.FromList(StartIDs, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  Add_EndIDs = Table.AddColumn(ConvertToTable, "Addition", each [Column1] + 4999, type number),
  RenamedColumns = Table.RenameColumns(Add_EndIDs, {{"Column1", "StartID"}, {"Addition", "EndID"}}),
  #"Changed Type" = Table.TransformColumnTypes(RenamedColumns, {{"StartID", type text}, {"EndID", type text}}),
  //Comment in only one of the fieldselect lines below, defining your select and expand columns using the example syntax shown                                                                                                                            
  fieldselect = "&$top=5000",  // all fields with no expansion                                                                                                                                                                                                                         
  //fieldselect = "&$top=5000&$select=ID,Title,Date,PersonColumn,ChoiceColumn,LookupColumn", // list desired fields (no expansion) -No Spaces!                                                                                                                                            
  //fieldselect = "&$top=5000&$select=ID,Title,Date,PersonColumn/LastName,PersonColumn/FirstName,ChoiceColumn,LookupColumn/Title,LookupColumn/Project,LookupColumn/ProjectStatus&$expand=PersonColumn,LookupColumn", //expand list fields - No Spaces!                                                                                                                                                                                                                                                    
  GetData = Table.AddColumn(#"Changed Type", "Items", each Json.Document(Web.Contents(siteurl, [RelativePath = "_api/web/lists/GetByTitle('" & listname & "')/items?$filter=(ID ge " & [StartID] & ") and (ID le " & [EndID] & ")" & fieldselect, Headers = [Accept = "application/json"]]))[value]),
  #"Removed Other Columns" = Table.SelectColumns(GetData, {"Items"}),
  #"Expanded Items" = Table.ExpandListColumn(#"Removed Other Columns", "Items")
in
  #"Expanded Items"

A second (and, in my opinon, more elegant) approach is possible with List.Generate. When pagination is set to true in the API call, a link to the next set of data is provided with each call (until the last items are returned). List.Generate can be used to recursively use the nextLink URL until it is empty. Please see the version below. In my testing, data are returned quickly for both. Theoretically, the List.Generate approach would be slower as every call is sequential (the results of each call need to be checked to see if another call is needed). With the above version, API calls for different ID ranges could be made in parallel.

let
  siteurl = "https://<your tenant url>/sites/<your site name>",  // use ... /sites/<your site name>/<your subsite name> if applicable                                                                                                                                                                                                            
  listname = "<your list name>",
  //Comment in only one of the fieldselect lines below, defining your select and expand columns using the example syntax shown                                                                                                                            
  fieldselect = "&$top=5000", // all fields with no expansion                                                              
  //fieldselect = "&$top=5000&$select=ID,Title,Date,PersonColumn,ChoiceColumn,LookupColumn", // list desired fields (no expansion) -No Spaces!                                                                                                                                            
  //fieldselect = "&$top=5000&$select=ID,Title,Date,PersonColumn/LastName,PersonColumn/FirstName,ChoiceColumn,LookupColumn/Title,LookupColumn/Project,LookupColumn/ProjectStatus&$expand=PersonColumn,LookupColumn",  //expand list fields - No Spaces!                                                                                                   
  InitialWebCall = Json.Document(Web.Contents(siteurl, [RelativePath = "_api/web/lists/GetByTitle('" & listname & "')/items?$skipToken=Paged=TRUE" & fieldselect, Headers = [Accept = "application/json"]])),
  datalist = List.Generate(() => InitialWebCall, each List.Count([value]) > 0, each try Json.Document(Web.Contents(siteurl, [RelativePath = "_api" & Text.AfterDelimiter([odata.nextLink], "_api"), Headers = [Accept = "application/json"]])) otherwise [value = {}], each [value]),
  #"Converted to Table" = Table.FromList(datalist, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1")
in
  #"Expanded Column1"

If you use either version, please share your feedback in the comments.

You May Also Like

About the Author: Pat Mahoney

Pat Mahoney is a chemist in Indiana who enjoys teaching others to use Power BI and other tools, and providing pro bono data science help/instruction to nonprofits, schools, etc.

2 Comments

  1. Thanks Pat, the List.Generate solution works great!

    My final hurdle is trying to figure out how to get the text values of choice fields as the "FieldValuesAsText" option is not generated. I know I can create a relationship with another table, but is there any other way you know of as I have a large number of these?

    1. Glad it's working for you, Justin. The demo list I used in the video also includes a choice column and it returned the text value (not the index). Is it a lookup column by chance and you need to use the $expand syntax shown in the 3rd fieldselect option?

Leave a Reply

Your email address will not be published. Required fields are marked *