Performance — Distributed Batch Process — Using as many servers as you have

The overall idea:

Implementing

SELECT * FROM blog.batchjob WHERE Status = 0 FOR UPDATE SKIP LOCKED LIMIT ?

The Entity Framework Core
I like to use Entity Framework because it gives you many ways to do the same thing, and which one fits depends on your scenario. In my case EF is always the first choice because you can measure query performance using the Visual Studio Profiler, among other things such as context migrations and Code First migrations.

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    => optionsBuilder
        .UseNpgsql(
            string.Format(
                "Server={0};Username={1};Database={2};Port={3};Password={4};SSLMode=Prefer",
                Host,
                User,
                DBname,
                Port,
                Password),
            x => x.MigrationsHistoryTable("_ef_migration_history", "blog"))
        .UseLowerCaseNamingConvention();

[Index(nameof(Status))]
public class BatchJob
{
    public long Id { get; set; }
    public string ClassName { get; set; }
    public string ContractName { get; set; }
    public string ContractJson { get; set; }
    public int Status { get; set; } = 0;
    public DateTime StartedAt { get; set; }
    public DateTime StoppedAt { get; set; }
}

Add-Migration InitialMigration -Context BlogContext -OutputDir Migrations/Blog
Update-Database -Context BlogContext
Update-Database 0 -Context BlogContext

The client, code, C#, reflection and serializer overall idea
To make the example work, some shortcuts were taken. As mentioned before, the migration is called from the client application, here named Blog.Performance.Batch.

static void Main(string[] args)
{
    Migration();
    CreateTaskList();
}

static void Migration()
{
    using var blogCtx = new BlogContext();
    blogCtx.Database.Migrate();
}

using var blogCtx = new BlogContext();
var entityType = blogCtx.Model.FindEntityType(typeof(BatchJob));
var schema = entityType.GetSchema();
var tableName = entityType.GetTableName();

// ctx is the context that exposes Vendor; its declaration is not shown here.
var vendors = ctx.Vendor
    .Select(v => new Vendor()
    {
        BusinessEntityId = v.BusinessEntityId
    }).ToList();

blogCtx.Database.ExecuteSqlRaw($"TRUNCATE TABLE {schema}.{tableName}");

static void CreateTaskList()
{
    //code not shown

    foreach (var item in vendors)
    {
        var contract = new SummedSalesContract
        {
            VendorId = item.BusinessEntityId
        };
        blogCtx.BatchJob.Add(new BatchJob
        {
            ClassName = typeof(SummedSalesBatch).AssemblyQualifiedName,
            ContractName = typeof(SummedSalesContract).AssemblyQualifiedName,
            ContractJson = JsonSerializer.Serialize(contract),
            Status = 0 //Pending
        });
    }
    blogCtx.SaveChanges();
}

The service
The service side is just a one-class program. Even with this example stripped down, the service is not complex, since its only job is to read the “next tasks” from the database and call the respective objects.
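To make the "call the respective objects" step concrete, here is a minimal sketch of how a service could turn the three strings stored in a BatchJob row (ClassName, ContractName, ContractJson) back into live objects. It is not the service code from this article. The shapes of IContract and ITaskfy are assumed from the Execute(IContract) signature, and EchoContract, EchoBatch and BatchDispatcher are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.Json;

// Assumed shapes: the article only shows that ITaskfy exposes Execute(IContract).
public interface IContract { }

public interface ITaskfy
{
    void Execute(IContract contract);
}

// Hypothetical contract, standing in for something like SummedSalesContract.
public sealed class EchoContract : IContract
{
    public int VendorId { get; set; }
}

// Hypothetical batch that only records and prints the vendor id it received.
public sealed class EchoBatch : ITaskfy
{
    public static int LastVendorId { get; private set; }

    public void Execute(IContract contract)
    {
        var echo = (EchoContract)contract;
        LastVendorId = echo.VendorId;
        Console.WriteLine($"Executed for VendorId {echo.VendorId}");
    }
}

public static class BatchDispatcher
{
    // Rebuilds the batch and its contract from the values stored in one job row,
    // then runs the batch.
    public static void Run(string className, string contractName, string contractJson)
    {
        // Assembly-qualified names let Type.GetType resolve types across assemblies.
        Type batchType = Type.GetType(className, throwOnError: true);
        Type contractType = Type.GetType(contractName, throwOnError: true);

        var contract = JsonSerializer.Deserialize(contractJson, contractType) as IContract
            ?? throw new InvalidOperationException("Contract JSON did not produce an IContract.");

        var batch = Activator.CreateInstance(batchType) as ITaskfy
            ?? throw new InvalidOperationException("Class does not implement ITaskfy.");

        batch.Execute(contract);
    }
}
```

Calling BatchDispatcher.Run with typeof(EchoBatch).AssemblyQualifiedName, typeof(EchoContract).AssemblyQualifiedName and JsonSerializer.Serialize(new EchoContract { VendorId = 42 }) mirrors what the client stores and what the service would read back. Because the row carries the type names, the service itself needs no compile-time knowledge of a specific batch, only that the assembly containing it can be loaded.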

The scenario
The scenario used as an example is simple. For each vendor in the database, a task is created. The contract holds the vendor ID, and the batch job uses this contract to calculate the total amount and the number of items purchased from that vendor.

public sealed class SummedSalesBatch : ITaskfy
{
    public void Execute(IContract _contract)
    {
        var contract = _contract as SummedSalesContract;
        var vendorId = contract.VendorId;

        using var ctx = new PurchasingContext();

        decimal total = ctx.PurchaseOrderHeader
            .Where(poh => poh.VendorId == vendorId)
            .Sum(poh => poh.SubTotal);

        decimal qty = (from details in ctx.PurchaseOrderDetail
                       join header in ctx.PurchaseOrderHeader
                           on details.PurchaseOrderId equals header.PurchaseOrderId
                       where header.VendorId == vendorId
                       select details.PurchaseOrderDetailsId)
                      .Count();

        Console.WriteLine($"VendorId: {vendorId}, Total: {total}, Items {qty}.");
    }
}

private static string DbAddress()
{
    string dbAddress = Environment.GetEnvironmentVariable("DB_ADDRESS");
    if (string.IsNullOrWhiteSpace(dbAddress))
    {
        dbAddress = "localhost";
    }
    return dbAddress;
}
version: '3.4'

services:
  simplebatchrunner:
    image: ${DOCKER_REGISTRY-}simplebatchrunner
    build:
      context: .
      dockerfile: SimpleBatchRunner/Dockerfile
    environment:
      DB_ADDRESS: ${DOCKER_GATEWAY_HOST:-host.docker.internal}

export DOCKER_GATEWAY_HOST=172.17.0.1
cd /path/to/copy/of/repository
docker-compose build
docker-compose up -d --scale simplebatchrunner=20

Conclusion
