Scraping Review Link information

Hello,
I’m working on a script to scrape the data in review links so I can share the assets in those links with my team as a CSV.

I’ve gotten to the point where I have a list of dictionaries with all the information I want. I’m quite sure I’m just missing a glaringly obvious thing with iterating through lists and dictionaries, but I’ve been staring at the same thing for so long, I thought I would ask for some assistance.

Looking at this info, I’m hoping to dig into the children and get a few pieces for the spreadsheet:

I had this little loop going:

import pprint

review_link_children = []
for i in range(len(items_in_review_link)):
    #pprint.pprint(items_in_review_link[i])
    if items_in_review_link[i][0]['asset']['children'] != []:
        review_link_children.append(items_in_review_link[i][0]['asset']['children'])
        print(items_in_review_link[i][0]['asset']['children'])
        for j in range(len(items_in_review_link[i][0]['asset']['children'])):
            children_in_review_link = {
                'child_name': items_in_review_link[i][0]['asset']['children'][j]['name'],
                'child_type': items_in_review_link[i][0]['asset']['children'][j]['type'],
                'child_size': items_in_review_link[i][0]['asset']['children'][j]['filesize'],
                'child_filetype': items_in_review_link[i][0]['asset']['children'][j]['filetype'],
            }
            review_link_children.append(children_in_review_link)

pprint.pprint(review_link_children)

and it indeed does not work! I get a “list index out of range” error on the line if items_in_review_link[i][0]['asset']['children'] != []:

I had put in the pprint.pprint(items_in_review_link[i][0]) line as a check, and it kicks back a “list index out of range” error on the last item in the range.

Any direction / help would be greatly appreciated!

Nancy

Are all the assets in a list under the children key?

If so could you just run something like this?

review_link_children = []
for item in overall_review_payload:
    children_list = item['asset']['children']
    for child in children_list:
        try:
            children_in_review_link = {'child_name': child['name'], 'child_type': child['type']}
        except KeyError as e:
            # Placeholder: handle a child that is missing 'name' or 'type'
            print("Doing something with my key errors:", e)
            continue  # skip this child so a stale dict is never appended
        review_link_children.append(children_in_review_link)

Not sure of any of the quirks you’re seeing in the payload, but this seems like the simple way of pulling those data points and placing them in a list

Maybe you could add something to check if the children key list isn’t empty too

(ignore the indents and formatting btw… :grin:)
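
One way to add the empty-children check mentioned above, as a minimal sketch. It assumes each item is a dict with an 'asset' key holding a 'children' list, as the thread describes; the function name collect_children and the sample data are made up for illustration and are not part of any Frame.io client.

```python
def collect_children(review_items):
    """Return one dict per child asset, skipping items with no children."""
    rows = []
    for item in review_items:
        # Fall back to empty values when 'asset' or 'children' is missing or None
        children = (item.get("asset") or {}).get("children") or []
        if not children:
            continue  # nothing to collect for this item
        for child in children:
            rows.append({
                "child_name": child.get("name"),
                "child_type": child.get("type"),
                "child_size": child.get("filesize"),
                "child_filetype": child.get("filetype"),
            })
    return rows


# Hypothetical sample shaped like the payload described in the thread
sample = [
    {"asset": {"children": [
        {"name": "a.mov", "type": "file", "filesize": 10, "filetype": "video/quicktime"},
    ]}},
    {"asset": {"children": []}},  # empty children list
    {"asset": {}},                # no children key at all
]
print(collect_children(sample))
```

Using .get() means a missing field shows up as None in the row instead of raising, which may or may not be what you want for the spreadsheet.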

Just wanted to say an actual thanks for helping out.
I discovered that there was a broken review link and that’s why I kept getting range errors.
My next step will be a more sophisticated try / except for finding those!
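
A rough sketch of what that try / except might look like, assuming a broken review link shows up as an empty or oddly shaped entry in items_in_review_link (which is what the index error suggests, but I have not confirmed it against the API). The name split_broken and the sample data are hypothetical.

```python
def split_broken(items_in_review_link):
    """Separate usable entries from broken ones, recording why each one failed."""
    good, broken = [], []
    for index, entry in enumerate(items_in_review_link):
        try:
            first = entry[0]            # IndexError if the entry is an empty list
            first["asset"]["children"]  # KeyError/TypeError if the shape is unexpected
        except (IndexError, KeyError, TypeError) as exc:
            broken.append((index, type(exc).__name__))
            continue
        good.append(first)
    return good, broken


# Hypothetical sample: one usable entry, one empty entry, one missing 'asset'
items = [
    [{"asset": {"children": []}}],
    [],
    [{"name": "no asset here"}],
]
good, broken = split_broken(items)
print("usable:", len(good))
print("broken:", broken)
```

Keeping the index of each broken entry makes it easier to go back and see which review link it came from.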

That would definitely do it!

Feel free to follow-up on this thread if you’re still having issues, and I can maybe throw together some more code samples for you.
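
Since the end goal is a CSV for the team, here is one such sample: a minimal sketch that writes the collected child dicts with the standard library's csv.DictWriter. The column names match the keys used earlier in the thread; the function name write_children_csv and the sample row are made up for illustration.

```python
import csv
import io

FIELDNAMES = ["child_name", "child_type", "child_size", "child_filetype"]


def write_children_csv(rows, out_file):
    """Write one CSV row per child dict; keys not in FIELDNAMES are ignored."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical row; an in-memory buffer stands in for a real file here
rows = [
    {"child_name": "a.mov", "child_type": "file",
     "child_size": 10, "child_filetype": "video/quicktime"},
]
buffer = io.StringIO()
write_children_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file, pass open("review_link_children.csv", "w", newline="") instead of the buffer (the filename is just an example); newline="" is what the csv module docs recommend so rows are not double-spaced on Windows.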

Thanks!

A couple questions:
- Is there a check I can run at the start of the loop to look for / skip broken links?
- Is it possible to do the same type of scraping for presentation links? I went to the API Reference Docs (https://developer.frame.io/api/reference/) and got a 404 File Not Found, though I remember there being a "https://api.frame.io/v2/projects/" + project_id + "/presentations" call. From there, is there a way to scrape the presentation link data to see what assets are associated with the link?

Thanks for your help!