Copilot instructions

2025-10-06 20:23:25 -04:00
parent e92499169a
commit d504e1e889
1 changed files with 48 additions and 0 deletions
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,48 @@
+# Copilot Instructions for Railfan Stuff Power Query Project
+
+## Project Overview
+This project consists of Power Query scripts (.m files) that process and filter railroad roster data from CSV files. The main goal is to clean, shape, and filter locomotive roster data for different railroads (CSX, AMTK, NS, etc.) before further analysis or reporting in Excel.
+
+## Architecture & Data Flow
+- **Source Data**: CSV files for each railroad (e.g., `CSX_roster.csv`, `AMTK_roster.csv`, `NS_roster.csv`) are loaded from a shared network path (`X:\railfan\`).
+- **Power Query Scripts**: Each script section (e.g., `CSX_roster`, `AMTK_roster`, `NS_roster`) follows a similar pattern:
+  - Load CSV
+  - Promote headers
+  - Change column types
+  - Remove unnecessary columns (`No Pictures`, `Col`)
+  - Apply blocklist filters to exclude unwanted rows
+- **Filtering Logic**: Rows are excluded if:
+  - `Unit No` contains "UNKNOWN"
+  - The `Notes` or `Serial` column contains any blocklisted term (e.g., "sold", "retired", "duplicate")
+- **Output**: The filtered table is returned for use in Excel or further Power Query steps.
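+
+A minimal sketch of one roster section following the steps above, written as a standalone `let` expression that can be pasted into the Advanced Editor. The blocklist contents, column types, step names, the body of `HasAny`, and the case-insensitive `UNKNOWN` check are assumptions for illustration; the actual script may differ.
+
+```powerquery
+let
+    // Assumed blocklist; the real terms live at the top of each section.
+    Blocklist = {"sold", "retired", "duplicate"},
+
+    // Assumed helper body: true when the text contains any term, ignoring case.
+    // A null cell counts as "no match".
+    HasAny = (text as nullable text, terms as list) as logical =>
+        text <> null
+        and List.AnyTrue(
+            List.Transform(terms, (term) => Text.Contains(text, term, Comparer.OrdinalIgnoreCase))
+        ),
+
+    // Load CSV
+    Source = Csv.Document(File.Contents("X:\railfan\CSX_roster.csv"), [Delimiter = ","]),
+    // Promote headers
+    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
+    // Change column types (only the columns used by the filter are shown here)
+    Typed = Table.TransformColumnTypes(
+        Promoted,
+        {{"Unit No", type text}, {"Notes", type text}, {"Serial", type text}}
+    ),
+    // Remove unnecessary columns
+    Trimmed = Table.RemoveColumns(Typed, {"No Pictures", "Col"}),
+    // Apply all exclusion rules in a single Table.SelectRows step
+    Filtered = Table.SelectRows(
+        Trimmed,
+        each not HasAny([Unit No], {"UNKNOWN"})
+            and not HasAny([Notes], Blocklist)
+            and not HasAny([Serial], Blocklist)
+    )
+in
+    Filtered
+```
+
+Keeping every exclusion rule inside one `Table.SelectRows` call means a reviewer can read the full filter for a railroad in one place.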
+
+## Key Files & Patterns
+- `Railfan Stuff.xlsx_PowerQuery.m`: Main Power Query script containing all logic for loading and filtering rosters.
+- CSV files are referenced by absolute path; update paths if moving data sources.
+- Blocklists are defined as lists at the top of each section for easy modification.
+- Filtering is performed in a single step using `Table.SelectRows` and helper functions for case-insensitive matching.
+
+## Developer Workflows
+- **Editing Queries**: Modify `.m` files directly in VS Code. Ensure blocklists and column names match the source CSV structure.
+- **Adding New Rosters**: Copy an existing section, update the CSV filename and blocklists as needed.
+- **Debugging**: Use Power Query Editor in Excel to preview results and troubleshoot issues with data loading or filtering.
+- **Data Source Changes**: If CSV locations change, update the `File.Contents` path in each section.
+
+## Conventions & Patterns
+- Consistent use of blocklists for filtering unwanted data.
+- All filtering logic is explicit and centralized in each roster section.
+- Helper function `HasAny` is used for case-insensitive substring matching against blocklists.
+- Only relevant columns are retained for output; unnecessary columns are removed early.
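+
+As an illustration of the matching convention, here is one way a helper like `HasAny` could be written. The function body, the blocklist contents, and the sample inputs are assumptions, not the project's actual definition.
+
+```powerquery
+let
+    // Assumed blocklist for the examples below.
+    Blocklist = {"sold", "retired", "duplicate"},
+
+    // Case-insensitive substring match against every term in the list.
+    // The null check runs first, so empty cells never raise an error.
+    HasAny = (text as nullable text, terms as list) as logical =>
+        text <> null
+        and List.AnyTrue(
+            List.Transform(terms, (term) => Text.Contains(text, term, Comparer.OrdinalIgnoreCase))
+        ),
+
+    Examples = [
+        MixedCase = HasAny("Retired 2019", Blocklist),  // true
+        NoMatch = HasAny("Active", Blocklist),          // false
+        NullCell = HasAny(null, Blocklist)              // false
+    ]
+in
+    Examples
+```
+
+Passing `Comparer.OrdinalIgnoreCase` to `Text.Contains` avoids lowercasing both sides by hand, and treating null as a non-match keeps rows with blank `Notes` or `Serial` values in the output.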
+
+## Integration Points
+- Power Query scripts are designed to be imported into Excel workbooks for data analysis.
+- No external dependencies beyond Power Query and the referenced CSV files.
+
+## Example: Adding a New Railroad Roster
+1. Copy an existing roster section (e.g., `shared CSX_roster = ...`).
+2. Update the CSV filename and blocklists as needed.
+3. Ensure column names and types match the new CSV structure.
+4. Test in Excel Power Query Editor.
+
+---
+For questions or improvements, review the main `.m` file and follow the established patterns. If anything is unclear, ask the project owner for clarification or examples.