Worker
The worker is what runs scans. The code can be found in the backend directory.
When running Crossfeed locally, instances of the worker are launched as Docker containers by the scheduler. When deployed, every worker instance is launched as a Fargate task.
Directory structure
The src/tasks folder contains the code for every scan that the worker supports. src/worker.ts is the main JavaScript entrypoint for the worker, which picks the right scan to run and runs it.
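The "pick the right scan and run it" step can be pictured as a lookup from scan name to handler. The following is only a minimal sketch, not the actual contents of src/worker.ts: the `CommandOptions` fields, the `handlers` registry, and `pickHandler` are hypothetical names chosen for illustration.

```typescript
// Hypothetical subset of the options passed to the worker; the real
// CommandOptions interface is defined in the Crossfeed backend.
interface CommandOptions {
  scanName: string;
  organizationId: string | undefined;
}

type ScanHandler = (options: CommandOptions) => Promise<void>;

// Illustrative registry. In the real code base, each scan's code lives
// in src/tasks; these stubs only record that they were called.
const ran: string[] = [];
const handlers: Record<string, ScanHandler> = {
  amass: async (options) => {
    ran.push("amass:" + options.scanName);
  },
  findomain: async (options) => {
    ran.push("findomain:" + options.scanName);
  },
};

// Look up the handler for a scan name, failing loudly on unknown scans.
function pickHandler(scanName: string): ScanHandler {
  const handler: ScanHandler | undefined = handlers[scanName];
  if (handler === undefined) {
    throw new Error("Unsupported scan: " + scanName);
  }
  return handler;
}

// Entry point sketch: select the handler, then run it.
async function runScan(options: CommandOptions): Promise<void> {
  await pickHandler(options.scanName)(options);
}
```

Failing on an unknown scan name, rather than silently doing nothing, makes a misconfigured scan visible as a failed task.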
The Dockerfile.worker file contains the instructions required to download the right dependencies and launch the worker. It launches worker/worker-entry.sh, which sets up MITMProxy to sign worker requests and then starts the JavaScript worker entrypoint (src/worker.ts).
The file infrastructure/worker.tf contains the Fargate task definition used to launch the worker and the ECR repository used to store the worker's built Docker image.
The file tasks/scheduler.ts handles scheduling workers based on existing Scans that have been configured on Crossfeed.
The file tasks/ecs-client.ts handles actually launching workers, interfacing with the Docker API (if local) or the AWS ECS API (if launching on Fargate).
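One way to think about this split is that the launch request is the same in both environments and only the backend differs. The sketch below illustrates that idea under stated assumptions; `LaunchRequest`, `LaunchPlan`, and `planLaunch` are hypothetical names, not the real ecs-client.ts API, and encoding the options as JSON is an assumption.

```typescript
// Hypothetical request shape, for illustration only.
interface LaunchRequest {
  commandOptions: Record<string, unknown>;
}

interface LaunchPlan {
  backend: "docker" | "ecs";
  env: { CROSSFEED_COMMAND_OPTIONS: string };
}

// Build a backend-agnostic description of the launch. A real client
// would then hand this to the Docker API or to the AWS ECS API.
function planLaunch(request: LaunchRequest, isLocal: boolean): LaunchPlan {
  return {
    backend: isLocal ? "docker" : "ecs",
    env: {
      // Assumption: the options are serialized as JSON.
      CROSSFEED_COMMAND_OPTIONS: JSON.stringify(request.commandOptions),
    },
  };
}
```

Keeping the environment identical across both backends means a worker behaves the same whether it was started locally or on Fargate.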
Configuration
To configure properties for the worker, you can modify environment variables in .env in the root directory.
If you need to configure the worker for deployment, you should update the env.yml file. You may also need to update parameters in AWS SSM, as several environment variables use values that are stored in SSM.
Scheduling
The Scan model represents a scheduled scan that is run on all organizations. A scan can be one of several types, for example amass or findomain.
The Lambda function scheduler.ts goes through each organization and determines which scans need to be run, based on their schedule and when they were last run on that organization.
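The scheduling decision described above can be sketched as a comparison between a scan's interval and the time it last ran for a given organization. This is a simplified illustration rather than the actual scheduler.ts logic: the `frequencySeconds` field, the `lastRuns` lookup keyed by scan and organization, and the function names are all assumptions.

```typescript
// Hypothetical, simplified view of a Scan.
interface Scan {
  id: string;
  name: string;
  frequencySeconds: number; // assumed: minimum interval between runs
}

interface DueTask {
  scanId: string;
  organizationId: string;
}

// A scan is due if it never ran for this organization, or if at least
// one full interval has passed since the last run.
function isDue(scan: Scan, lastRunAt: Date | undefined, now: Date): boolean {
  if (lastRunAt === undefined) {
    return true;
  }
  return now.getTime() - lastRunAt.getTime() >= scan.frequencySeconds * 1000;
}

// lastRuns is keyed by "<scanId>:<organizationId>" (an illustrative choice).
function dueScanTasks(
  scans: Scan[],
  organizationIds: string[],
  lastRuns: Record<string, Date>,
  now: Date
): DueTask[] {
  const due: DueTask[] = [];
  for (const organizationId of organizationIds) {
    for (const scan of scans) {
      const lastRunAt: Date | undefined = lastRuns[scan.id + ":" + organizationId];
      if (isDue(scan, lastRunAt, now)) {
        due.push({ scanId: scan.id, organizationId });
      }
    }
  }
  return due;
}
```

Tracking the last run per organization, rather than per scan, lets a newly added organization be scanned right away without rerunning the scan everywhere else.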
ScanTask
The ScanTask model represents a single scan task on a single organization and stores the status and errors, if any, of that particular task.
When a scan is run, a ScanTask model is created, which launches a Fargate task. When the worker runs, it connects to the database and updates its ScanTask's status accordingly.
All information needed for the scan (defined in the CommandOptions interface) is specified through the CROSSFEED_COMMAND_OPTIONS environment variable. Other secrets needed for the Fargate task to run are specified in the task configuration through Terraform.
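On the worker side, reading these options amounts to decoding the environment variable and validating it before use. The sketch below assumes the value is JSON-encoded and uses a hypothetical subset of fields (`scanName`, `scanTaskId`, `organizationId`); the real CommandOptions interface may differ.

```typescript
// Hypothetical subset of CommandOptions, for illustration only.
interface CommandOptions {
  scanName: string;
  scanTaskId: string;
  organizationId: string | undefined;
}

// Decode and validate the raw environment variable value.
// Assumption: the value is a JSON object.
function parseCommandOptions(raw: string | undefined): CommandOptions {
  if (raw === undefined || raw === "") {
    throw new Error("CROSSFEED_COMMAND_OPTIONS is not set");
  }
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("CROSSFEED_COMMAND_OPTIONS must be a JSON object");
  }
  const fields = parsed as Record<string, unknown>;
  const scanName = fields["scanName"];
  const scanTaskId = fields["scanTaskId"];
  const organizationId = fields["organizationId"];
  if (typeof scanName !== "string" || typeof scanTaskId !== "string") {
    throw new Error("scanName and scanTaskId are required");
  }
  return {
    scanName,
    scanTaskId,
    organizationId: typeof organizationId === "string" ? organizationId : undefined,
  };
}

// Typical use inside the worker (Node.js):
//   const options = parseCommandOptions(process.env.CROSSFEED_COMMAND_OPTIONS);
```

Validating at startup means a malformed task fails immediately with a clear message instead of partway through a scan.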
You can view the most recent Scan Tasks, as well as their logs, on the "Scan History" page:
![scan tasks](./img/scan tasks.png)
ScanTask status reference
- created: model is created
- queued: Fargate capacity has been reached, so the task will run whenever there is available capacity
- requested: a request to Fargate has been sent to start the task
- started: the Fargate container has started running the task
- finished: the Fargate container has finished running the task
- failed: any of the steps above have failed
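The statuses above form a small state machine. The following sketch encodes them as a TypeScript union with a transition table; the set of allowed transitions is inferred from the descriptions in this list and is an assumption, not something taken from the Crossfeed code.

```typescript
type ScanTaskStatus =
  | "created"
  | "queued"
  | "requested"
  | "started"
  | "finished"
  | "failed";

// Assumed transitions: a task is queued only when capacity is reached,
// and any non-terminal step may fail.
const nextStatuses: Record<ScanTaskStatus, ScanTaskStatus[]> = {
  created: ["queued", "requested", "failed"],
  queued: ["requested", "failed"],
  requested: ["started", "failed"],
  started: ["finished", "failed"],
  finished: [],
  failed: [],
};

function canTransition(from: ScanTaskStatus, to: ScanTaskStatus): boolean {
  return nextStatuses[from].indexOf(to) !== -1;
}

// A status is terminal when no further transitions are allowed.
function isTerminal(status: ScanTaskStatus): boolean {
  return nextStatuses[status].length === 0;
}
```

For example, `canTransition("created", "queued")` is true, while `canTransition("finished", "started")` is false because `finished` is terminal.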
Running scans locally
In order to run scans locally or work on scanning infrastructure, you will need to set up the Fargate worker and rebuild it periodically when worker code changes.
Building the worker Docker image
Each time you make changes to the worker code, you should run the following command to rebuild the worker Docker image:
npm run build-worker
Running workers locally
To run a worker locally, create a scan from the Crossfeed UI. When running locally, the scheduler function runs every 30 seconds for convenience, so your worker will start shortly. To trigger a run immediately, click the "Manually run scheduler" button on the Scans page.
Once a worker has started, it is accessible as a running Docker container.
You can list running containers with docker ps (or docker ps -a | head -n 3 to include stopped workers) and check a worker's logs with docker logs [containername].
You can check the scheduler logs locally by checking the backend container logs.
Generating Censys types
The censysIpv4.ts and censysCertificates.ts type files in the backend/src/models/generated directory have been automatically generated from Censys's published schemas. If you need to regenerate these type files, run:
npm run codegen