Extracting Data From OpenAir Without API Access

I had a recent need to pull a lot of data out of OpenAir.  There was a requirement to audit some data specific to each employee of the organization.

Ordinarily this sort of task would come with API access to the system in question, and it would be fairly trivial to retrieve the required data and offload it to my workstation for the requisite processing.

Unfortunately, I do not have API access to the OpenAir instance in question. Furthermore, the instance is access through Okta, which adds an additional layer of abstraction to the issue.  Without the Okta layer in place, I might be able to goose it directly from a script. 

So how do we access hundreds of pages of data on a website that sits behind another website, and which provides no documented API access?

Let’s try Selenium.

The Okta issue is actually pretty easy to solve.  If we tell Selenium to navigate to the Okta login page, and feed the appropriate credentials to the relevant form elements, it’ll log us in to the Okta instance.

Please note that in the script below, we’re storing the credentials in a separate file

Here’s a quick one.   Fetch a list of all projects in a Jira Cloud instance, then fetch a list of all of the issues in each project.  Paginate through the resulting list of issues, and for each issue write the issue key and issue status to a CSV file.

 import requests
import json
import base64
import csv

cloud_username = "<email>"
cloud_token = "<token>"
cloud_url = "<cloud URL>"

def credentials_encode(username, password):
    credentials_string = f'{username}:{password}'
    input_bytes = credentials_string.encode('utf-8')
    encoded_bytes = base64.b64encode(input_bytes)
    encoded_string = encoded_bytes.decode('utf-8')
    return encoded_string

encoded_cloud_credentials = credentials_encode(cloud_username, cloud_token)
# Encode the credentials that we provided

request_headers = {
    'Authorization': f'Basic {encoded_cloud_credentials}',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
    'X-Atlassian-token': 'no-check'
}
# Create a header object used for the HTTP GET requests

get_projects = requests.get(f"{cloud_url}/rest/api/latest/project", headers=request_headers)
# Get a list of all projects in the instance

projects_json = json.loads(get_projects.content)
# Convert the list of projects to JSON

with open('project_issues.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    # Create a CSV file

    for project in projects_json:
        # Iterate through the list of projects

        start_at = 0
        max_results = 100
        # Declare variables used in pagination

        project_key = project['key']
        # Fetch the key of the current project from the JSON

        while True:
            # Loop until 

Management of users, groups, authentication, and directories happens outside of an organization’s primary Atlassian Cloud domain.   Even if an organization uses https://org1234.atlassian.net for their Jira, all user administration happens on https://admin.atlassian.com

Atlassian has provided very little in the way of API methods by which Cloud users may be managed.  For example, the quickest way to bulk-change users from one authentication policy to another is to create a CSV, and import that CSV from the front end.   This is… not convenient.

Unlike domains at the organizational level, the Atlassian Admin portal doesn’t use a username and a token for authentication.   Instead, it uses a cloud.session.token.  When you navigate from an organizational domain to the Admin portal, this token is generated and stored as a cookie.

I haven’t yet figured out how to generate the cloud.session.token with Python.   Instead, what we’re first going to do is authenticate against the admin portal in our web browser, and then “borrow” that cookie for our script.  Here are the steps to do this:

  • Log in to the Atlassian Cloud in your browser
  • Go to https://admin.atlassian.com/
  • Right-click the page, and inspect
  • Open the network tab
  • Refresh the page
  • Locate the GET request that was sent

This script builds upon the previous script.  It authenticates against both Jira Server and Jira Cloud.  It then takes a list of Projects as input, and compares the issue count for the project on the Server and Cloud side.

This would be most useful in the case of an ongoing migration, to validate a successful transfer of data.

 import requests
import json
import base64

projects = []
#Define a list of target projects to compare

server_username = "<server username>"
server_password = "<server password>"
server_url = "<server URL>"
#Define connection parameters for the server side

cloud_username = "<Cloud username"
cloud_token = "<Cloud token>"
cloud_url = "<Cloud URL>"
#Define connection parameters for the Cloud side

server_credentials_string = f'{server_username}:{server_password}'
server_input_bytes = server_credentials_string.encode('utf-8')
server_encoded_bytes = base64.b64encode(server_input_bytes)
server_encoded_string = server_encoded_bytes.decode('utf-8')
#Encode the server credentials

cloud_credentials_string = f'{cloud_username}:{cloud_token}'
cloud_input_bytes = cloud_credentials_string.encode('utf-8')
cloud_encoded_bytes = base64.b64encode(cloud_input_bytes)
cloud_encoded_string = cloud_encoded_bytes.decode('utf-8')
#Encode the Cloud credentials

server_headers = {
    'Authorization': f'Basic {server_encoded_string}',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
}
#Define the headers used to connect to the server

cloud_headers = {
    'Authorization': f'Basic {cloud_encoded_string}',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
}
#Define the headers used to connect to the Cloud

#Iterate through the list of projects
for project in projects:
    server_issue_count_request = requests.get(f'{server_url}/rest/api/2/search?jql=project={project}',
                                            headers=server_headers)
    

Connecting to server and Cloud instances of Jira with Python is accomplished with much the same method and approach. The only differences between the two are that Server uses a username and password, while Cloud uses a username and token.

Generating a token is pretty straightfoward.  I recommend reading the documentation first.

The script below consists of essentially three pieces.   You define the connection parameters,  create the headers used to authenticate against the instance, and return the results of the authentication request.

The script example below returns one page of project results from each instance, just to demonstrate how it works.  If you wanted to actually work with the results, they’d need to be converted to JSON or some other format. 

The process for connecting to Confluence is the same; you need only point the script at a Confluence instance (and switch to returning some Space data or something).

 import requests
import base64

server_username = "<username>"
server_password = "<password>"
server_url = "<url>"
#Define connection parameters for the server side

cloud_username = "<Cloud login email>"
cloud_token = "<Cloud token"
cloud_url = "<Cloud url>"
#Define connection parameters for the Cloud side

server_credentials_string = f'{server_username}:{server_password}'
server_input_bytes = server_credentials_string.encode('utf-8')
server_encoded_bytes = base64.b64encode(server_input_bytes)
server_encoded_string 

Introduction

I’ve started working on a QR-code based inventory management and pricing system.   One of the foundational elements of this system is the ability to print a price tag with a QR code on it, and to be able to update the link associated with that QR code without replacing the sticker.

This is possible if the QR code links to bit.ly instead of directly to the link in question.   So long as the shortened URL is generated under a Bitly account, it can be edited and modified after the fact.

The Bitly API is at the same time well documented, and a bit frustrating.  It’s frustrating because all of the example Python code on the internet uses the bitly_api package, which is apparently either abandoned or complete trash.   For example, all of the examples on the internet result in an error like this:

  bitly api.bitly _api.Bitly Error: "PERMANENTLY REMOVED"

 I assume this means that the method has been removed from the class or package, but I couldn’t find a way to fix it.

Instead, let’s use the https requests library to connect to the Bitly API and generate a shortened link.

Setup

First things first, you should go check

I wanted to create an interface in Python that had a row of icons at the top. Depending on the screen being displayed, I wanted one of those icons to be highlighted or a different color than the others. 

This proved to be more challenging than I expected.  You can set the color of all of the icons, but setting the color of a single one is a different story.

The solution that I came up with was to create a function that sets the target icon color when the program loads, rather than trying to do it as part of a KivyMD attribute of the TopAppBar widget.

  def set_icon_color(self, dt):
        screen_1_icon = self.screen_1.ids.menu_app_bar.ids.right_actions.children
        #Tell the method where to find the homescreen icon

        screen_1_icon[0].theme_icon_color = "Custom"
        screen_1_icon[0].text_color = "00ADB5"
        #define what the icon should look like

        screen_2_icon = self.screen_2.ids.menu_app_bar.ids.right_actions.children

        screen_2_icon[1].theme_icon_color = "Custom"
        screen_2_icon[1].text_color = "00ADB5"

        screen_2_icon = self.screen_3.ids.menu_app_bar.ids.right_actions.children

        screen_3_icon[2].theme_icon_color = "Custom"
        screen_3_icon[2].text_color = "00ADB5" 

 

We call this function on program load.  This sets the color of the target individual icon on each screen, without affecting the others.   When I switch to any given screen, the appropriate icon is already highlighted with a distinct color.

The other thing I had

There may come a day when you’re asked to create a large number of Confluence pages. Rather than doing it by hand, why not script it?

This Python script essentially does two things: it reads the CSV file, and it sends page creation requests to a Confluence server.   

For each row in the CSV file, it assumes the page name should be the value in the first cell of the row.  It then generates an HTML table that is sent as part of the page creation request. 

Rather than generating HTML, this could be useful for setting up a large number of template pages, to be filled in by various departments.  It could also run as a job, and automatically create a certain selection of pages every week or month, to store meeting notes or reports.

Please note that in order to connect to the Confluence server, you’ll need to generate a Personal Access Token.

 

 import csv
import requests
import json
import html
import logging

# Initialize logging
logging.basicConfig(level=logging.ERROR)

api_url = 'https://<url>.com/rest/api/content/'
#What's the URL to your Confluence DC instance?


file_path = "<your CSV file path>"
#where is the file stored locally?

parent_page_id = "<your parent page ID>"

The request on the Atlassian Forums that caught my eye last night was a request to return all Jira Cloud attachments with a certain extension.   Ordinarily it would be easy enough to cite ScriptRunner as the solution to this, but the user included data residency concerns in his initial post.

My solution to this was to write him a Python script to provide simple reporting on issues with certain extension types.    Most anything that the rest API can accomplish can be accomplished with Python; you don’t HAVE to have ScriptRunner.

The hardest part of working with external scripts is figuring out the authorization scheme. Once you’ve got that figured out, the rest is just the same REST API interaction that you’d get with UniREST and ScriptRunner for cloud.

Authorizing the Script

First, read the instructions: https://developer.atlassian.com/cloud/jira/platform/basic-auth-for-rest-apis/ 

Then:

1. Generate an API token: https://id.atlassian.com/manage-profile/security/api-tokens

2. Go to https://www.base64encode.net/ (or figure out the Python module to do the encoding)

3. Base-64 encode a string in exactly this format: youratlassianlogin:APIToken.   
If your email is john@adaptamist.com and the API token you generated is:

ATATT3xFfGF0nH_KSeZZkb_WbwJgi131SCo9N-ztA3SAySIK5w3qo9hdrxhqHZAZvimLHxbMA7ZmeYRMMNR

 

Then the string you base-64 encode is:

john@adaptamist.com:ATATT3xFfGF0nH_KSeZZkb_WbwJgi131SCo9N-ztA3SAySIK5w3qo9hdrxhqHZAZvimLHxbMA7ZmeYRMMNR

 

Do not forget the colon between the two pieces.

In my previous post I explored how to access the Confluence Cloud Space Permissions API endpoint. 

This Python script extends that, and gives a user a permission set in all Spaces in Confluence. This could be useful if you wanted to give one person Administrative rights on all Spaces in Confluence, for example.

Note that the user must first have READ/SPACE permission before any other permissions can be granted.

 from requests.models import Response
import requests
import json
  
headers = {
    'Authorization': 'Basic <Base-64 encoded username and password>',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
}
  
userID = '<user ID (not name)>'
  
  
url='https://<url>.atlassian.net/wiki/rest/api/space/'
resp = requests.get(url, headers=headers)
data = json.loads(resp.text)
  
  
for lines in data["results"]:
  url="https://<url>.atlassian.net/wiki/rest/api/space/"+lines["key"] + '/permission'
  
  dictionary = {"subject":{"type":"user","identifier": userID},"operation":{"key":"read","target":"space"},"_links":{}}
  
  data = data=json.dumps(dictionary)
    
  try:
      response = requests.post(url=url, headers=headers, data=data)
      print(response.content)
  except:
    print("Could not add permissions to Space " + lines["key"])