Extracting Data From OpenAir Without API Access

I recently needed to pull a lot of data out of OpenAir in order to audit some data specific to each employee of the organization.

Ordinarily this sort of task would come with API access to the system in question, and it would be fairly trivial to retrieve the required data and offload it to my workstation for the requisite processing.

Unfortunately, I do not have API access to the OpenAir instance in question. Furthermore, the instance is accessed through Okta, which adds another layer of indirection to the problem. Without the Okta layer in place, I might be able to goose it directly from a script.

So how do we access hundreds of pages of data on a website that sits behind another website, and which provides no documented API access?

Let’s try Selenium.

The Okta issue is actually pretty easy to solve.  If we tell Selenium to navigate to the Okta login page, and feed the appropriate credentials to the relevant form elements, it’ll log us in to the Okta instance.
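
As a rough sketch of that login step: the function below takes any Selenium WebDriver, opens the login page, and fills in the form. The element IDs are assumptions based on the classic Okta sign-in widget, so inspect your own login page and adjust them. The function name and its parameters are also just for illustration.

```python
# Hypothetical element IDs: these match the classic Okta sign-in widget,
# but your tenant's login page may use different ones.
USERNAME_FIELD_ID = "okta-signin-username"
PASSWORD_FIELD_ID = "okta-signin-password"
SUBMIT_BUTTON_ID = "okta-signin-submit"


def okta_login(driver, login_url, username, password):
    """Open the Okta login page and submit the credentials through the form."""
    driver.get(login_url)
    # "id" is the locator strategy string that Selenium's By.ID constant holds
    driver.find_element("id", USERNAME_FIELD_ID).send_keys(username)
    driver.find_element("id", PASSWORD_FIELD_ID).send_keys(password)
    driver.find_element("id", SUBMIT_BUTTON_ID).click()


# Typical use, with Selenium installed and a browser driver available:
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   okta_login(driver, "https://<your org>.okta.com", "<username>", "<password>")
```

On a real page the form is usually rendered by JavaScript, so you will probably need to wait for the fields to appear before typing into them.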

Please note that in the script below, we’re storing the credentials in a separate file

Here’s a quick one.   Fetch a list of all projects in a Jira Cloud instance, then fetch a list of all of the issues in each project.  Paginate through the resulting list of issues, and for each issue write the issue key and issue status to a CSV file.

import requests
import json
import base64
import csv

cloud_username = "<email>"
cloud_token = "<token>"
cloud_url = "<cloud URL>"

def credentials_encode(username, password):
    credentials_string = f'{username}:{password}'
    input_bytes = credentials_string.encode('utf-8')
    encoded_bytes = base64.b64encode(input_bytes)
    encoded_string = encoded_bytes.decode('utf-8')
    return encoded_string

encoded_cloud_credentials = credentials_encode(cloud_username, cloud_token)
# Encode the credentials that we provided

request_headers = {
    'Authorization': f'Basic {encoded_cloud_credentials}',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
    'X-Atlassian-token': 'no-check'
}
# Create a header object used for the HTTP GET requests

get_projects = requests.get(f"{cloud_url}/rest/api/latest/project", headers=request_headers)
# Get a list of all projects in the instance

projects_json = json.loads(get_projects.content)
# Parse the JSON response into a list of Python dictionaries

with open('project_issues.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    # Create a CSV file

    for project in projects_json:
        # Iterate through the list of projects

        start_at = 0
        max_results = 100
        # Declare variables used in pagination

        project_key = project['key']
        # Fetch the key of the current project from the JSON

        while True:
            # Loop until 
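
The excerpt stops at the pagination loop. Here is a rough sketch of how a loop like that can work, which is not necessarily how the original script does it. It assumes each page of search results is a dictionary with an "issues" list and, optionally, a "total" count. The fetch_page callable is hypothetical: it stands in for whatever function performs the HTTP GET with startAt and maxResults and returns the decoded JSON.

```python
def iter_issues(fetch_page, max_results=100):
    """Yield every issue from a paginated search, one page at a time.

    fetch_page(start_at, max_results) is a hypothetical callable that performs
    the HTTP request and returns the decoded JSON response as a dict.
    """
    start_at = 0
    while True:
        page = fetch_page(start_at, max_results)
        issues = page.get("issues", [])
        if not issues:
            break  # an empty page means there is nothing left to read
        yield from issues
        start_at += len(issues)
        total = page.get("total")
        if total is not None and start_at >= total:
            break  # every issue the server reported has been yielded


def issue_row(issue):
    """Return the CSV row for one issue: its key and its status name."""
    return [issue["key"], issue["fields"]["status"]["name"]]
```

Each row can then be handed to csvwriter.writerow() inside the loop over projects.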

Management of users, groups, authentication, and directories happens outside of an organization’s primary Atlassian Cloud domain. Even if an organization uses https://org1234.atlassian.net for their Jira, all user administration happens on https://admin.atlassian.com.

Atlassian has provided very little in the way of API methods by which Cloud users may be managed.  For example, the quickest way to bulk-change users from one authentication policy to another is to create a CSV, and import that CSV from the front end.   This is… not convenient.

Unlike domains at the organizational level, the Atlassian Admin portal doesn’t use a username and a token for authentication.   Instead, it uses a cloud.session.token.  When you navigate from an organizational domain to the Admin portal, this token is generated and stored as a cookie.

I haven’t yet figured out how to generate the cloud.session.token with Python.   Instead, what we’re first going to do is authenticate against the admin portal in our web browser, and then “borrow” that cookie for our script.  Here are the steps to do this:

  • Log in to the Atlassian Cloud in your browser
  • Go to https://admin.atlassian.com/
  • Right-click the page, and inspect
  • Open the network tab
  • Refresh the page
  • Locate the GET request that was sent
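
Assuming the cookie value has been copied out of that request's headers, a minimal sketch of “borrowing” it looks like the code below. It only builds the request and does not send it. It uses the standard library so it runs without extra packages, and the same Cookie header could be passed to requests instead. The function name is made up for illustration.

```python
import urllib.request

ADMIN_URL = "https://admin.atlassian.com/"


def build_admin_request(session_token, url=ADMIN_URL):
    """Build (but do not send) a GET request carrying the borrowed cookie."""
    return urllib.request.Request(
        url,
        headers={"Cookie": f"cloud.session.token={session_token}"},
        method="GET",
    )


# To send it:
#   urllib.request.urlopen(build_admin_request("<token copied from the browser>"))
```

Keep in mind that the cookie is a live session credential, so it will stop working when the browser session expires.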

Connecting to Server and Cloud instances of Jira with Python is accomplished with much the same method and approach. The only difference between the two is that Server uses a username and password, while Cloud uses a username and token.

Generating a token is pretty straightforward.  I recommend reading the documentation first.

The script below consists of essentially three pieces.   You define the connection parameters,  create the headers used to authenticate against the instance, and return the results of the authentication request.

The script example below returns one page of project results from each instance, just to demonstrate how it works.  If you wanted to actually work with the results, they’d need to be converted to JSON or some other format. 

The process for connecting to Confluence is the same; you need only point the script at a Confluence instance (and switch to returning some Space data or something).

import requests
import base64

server_username = "<username>"
server_password = "<password>"
server_url = "<url>"
#Define connection parameters for the server side

cloud_username = "<Cloud login email>"
cloud_token = "<Cloud token>"
cloud_url = "<Cloud url>"
#Define connection parameters for the Cloud side

server_credentials_string = f'{server_username}:{server_password}'
server_input_bytes = server_credentials_string.encode('utf-8')
server_encoded_bytes = base64.b64encode(server_input_bytes)
server_encoded_string 

Introduction

I’ve started working on a QR-code based inventory management and pricing system.   One of the foundational elements of this system is the ability to print a price tag with a QR code on it, and to be able to update the link associated with that QR code without replacing the sticker.

This is possible if the QR code links to bit.ly instead of directly to the link in question.   So long as the shortened URL is generated under a Bitly account, it can be edited and modified after the fact.

The Bitly API is at the same time well documented, and a bit frustrating.  It’s frustrating because all of the example Python code on the internet uses the bitly_api package, which is apparently either abandoned or complete trash.   For example, all of the examples on the internet result in an error like this:

  bitly_api.bitly_api.BitlyError: "PERMANENTLY REMOVED"

I assume this means that the method has been removed from the class or package, but I couldn’t find a way to fix it.

Instead, let’s use the requests HTTP library to connect to the Bitly API and generate a shortened link.
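
As a rough sketch of what that request looks like: Bitly's v4 API takes a POST to its shorten endpoint, with the access token sent as a Bearer header and the long URL in a JSON body. The function below only builds the request and does not send it. It uses the standard library so it runs without extra packages, and the same URL, headers, and body can be handed to requests.post. The function name is an assumption for illustration.

```python
import json
import urllib.request

BITLY_SHORTEN_URL = "https://api-ssl.bitly.com/v4/shorten"


def build_shorten_request(access_token, long_url):
    """Build (but do not send) a POST request that asks Bitly for a short link."""
    payload = json.dumps({"long_url": long_url}).encode("utf-8")
    return urllib.request.Request(
        BITLY_SHORTEN_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it and read the short link out of the JSON response:
#   with urllib.request.urlopen(build_shorten_request("<token>", "<long URL>")) as resp:
#       short_link = json.load(resp)["link"]
```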

Setup

First things first, you should go check

 

This script fetches all of the projects in a Jira Cloud instance. It then fetches all of the project roles for that project, and finally fetches all of the users in that role for that project. In this way, it iterates through the projects and returns information about the users in the project roles.

 

import groovy.json.JsonSlurper

def sb = []
//Define a list to hold the results

def getProjects = get("/rest/api/2/project")
  .header('Content-Type', 'application/json')
  .asJson()
//Get the list of projects in the instance

def content = getProjects.properties.rawBody
//Get the raw body contents of the HTTP response

def scanner = new java.util.Scanner(content).useDelimiter("\\A")
String rawBody = scanner.hasNext() ? scanner.next() : ""
def json = new JsonSlurper().parseText(rawBody)
//Turn the raw body contents into JSON

json.each{ project ->
//Iterate through the projects
  
  sb.add("$project.name")

  def getRoles = get("/rest/api/2/project/$project.id/role")
    .header('Content-Type', 'application/json')
    .asObject(Map)
//For each project, get the list of roles


  getRoles.body.each{ projectRole ->
  //Iterate through the project roles

      def getRoleMembers = get("$projectRole.value")
      .header('Content-Type', 'application/json')
      .asObject(Map)
      //Return the details about each role

    getRoleMembers.body.actors.each{ roleMember ->
    //Get all the actors (users) in that role

        sb.add("$getRoleMembers.body.name:   $roleMember.displayName")
    }
  }
}

return sb
//Return the results


 

Mitigating CORS Errors With Custom Jira REST API Endpoints

If you dive into the world of REST requests and APIs, you may encounter a CORS error that prevents your request from completing. CORS stands for Cross-Origin Resource Sharing. The same-origin policy is a security feature in browsers that prevents a page loaded from one origin from reading resources on a different origin. CORS provides a standard by which a server can safely opt in to cross-origin requests.

Let’s talk about the example that I encountered.  I wrote a JavaScript macro for Confluence Server, and I was trying to access a third-party API using that macro.  However, Confluence macros run in the browser when the page loads, rather than running on the back-end Confluence server itself.   Thus, while the Confluence server may be set up to address CORS, your browser almost certainly is not, and the request gets blocked.

We can address this by creating a custom REST API endpoint in Confluence (or Jira).   In this way, we have the server making the request to the third party API, and the macro makes the request to the internal API.

In other words, the custom REST API endpoint acts

Overview

Tempo Planner allows for planning team capacity and schedules within Jira.  However, you may have some need to pull that resource planning information out of the Tempo interface and add it to a ticket.

The Tempo API has some severe limitations, but where there’s a will there’s a way.

Team Info

The first thing we’ll examine is how to get information on all of the teams in Tempo Planner. According to the documentation, this isn’t possible. Per the API documentation, you can return very limited information about plans and allocations.

Naturally I found this to be unacceptable, and I figured out a way to have the API return all of the teams. One of the undocumented API endpoints is a search function: /rest/tempo-teams/3/search. One of the tricks to using this method is that it’s not a GET, it’s a POST, so we have to supply a search parameter as a payload. When we POST to this endpoint, we supply some JSON: {"teamSearchString":"<string>"}. But here’s the rub: the API will accept an empty search string, and return all of the teams as a result.
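
A minimal sketch of that POST, using the endpoint and payload described above. The function only builds the request and does not send it. The base_url and headers parameters are assumptions: pass in whatever authentication headers your Jira instance expects.

```python
import json
import urllib.request


def build_team_search_request(base_url, search_string="", headers=None):
    """Build (but do not send) the POST request for the team search endpoint.

    An empty search_string asks for every team.
    """
    body = json.dumps({"teamSearchString": search_string}).encode("utf-8")
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})  # caller-supplied auth headers, if any
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/rest/tempo-teams/3/search",
        data=body,
        headers=all_headers,
        method="POST",
    )


# To send it:
#   urllib.request.urlopen(build_team_search_request("<Jira URL>", headers={...}))
```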

Allocation Info

Much like team info, there is no public Tempo API endpoint that will

There may come a day when you’re asked to create a large number of Confluence pages. Rather than doing it by hand, why not script it?

This Python script essentially does two things: it reads the CSV file, and it sends page creation requests to a Confluence server.   

For each row in the CSV file, it assumes the page name should be the value in the first cell of the row.  It then generates an HTML table that is sent as part of the page creation request. 

Rather than generating HTML, this could be useful for setting up a large number of template pages, to be filled in by various departments.  It could also run as a job, and automatically create a certain selection of pages every week or month, to store meeting notes or reports.

Please note that in order to connect to the Confluence server, you’ll need to generate a Personal Access Token.

 

import csv
import requests
import json
import html
import logging

# Initialize logging
logging.basicConfig(level=logging.ERROR)

api_url = 'https://<url>.com/rest/api/content/'
#What's the URL to your Confluence DC instance?


file_path = "<your CSV file path>"
#where is the file stored locally?

parent_page_id = "<your parent page ID>"
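
The excerpt ends before the page creation step. As a sketch of that step under some assumptions, the helpers below turn one CSV row into the JSON body that Confluence's content endpoint expects for a new page. The single-row table layout and the space_key parameter are my assumptions, not necessarily what the original script does.

```python
import html


def build_table_html(row):
    """Render one CSV row as a single-row HTML table, escaping each cell."""
    cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
    return f"<table><tbody><tr>{cells}</tr></tbody></table>"


def build_page_payload(row, parent_page_id, space_key):
    """Build the page creation payload; the first cell becomes the title."""
    return {
        "type": "page",
        "title": row[0],
        "ancestors": [{"id": parent_page_id}],  # parent page of the new page
        "space": {"key": space_key},  # space_key is an assumed extra setting
        "body": {
            "storage": {
                "value": build_table_html(row),
                "representation": "storage",
            }
        },
    }
```

Each payload would then be sent as JSON in a POST to api_url, with the Personal Access Token in an Authorization: Bearer header.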

The amount of code required to fetch information from Confluence Cloud and bring it into Jira Cloud is a bit shocking. In a good way.

Here’s the code:

import org.jsoup.*

def authString = "<authstring>"

def fieldConfigsResult = get("https://<url>.atlassian.net/wiki/rest/api/content/229377?expand=body.storage")
  .header('Content-Type', 'application/json')
  .header("Authorization", "Basic ${authString}")
  .asObject(Map)

def storage = fieldConfigsResult.body.body.storage.value


return storage
 

 

In the end it’s all just REST. So long as you can authenticate, Unirest allows us to pretty easily fetch information from other sites.

If you’d like to learn more about authenticating against Jira Cloud, check out my post on the subject.