AI Risks + Viva Engage

September 18, 2025 – Lab Notes

Research & Exploration

  • Looked into open-weight models and guardrails:
    • Guardrails are implemented after weights are released, not bundled with them.
    • Risks: poisoned fine-tuning can create backdoored models.
    • Explored unauthorized tool access concepts:
      • Exfiltration = stealing data out of a system.
      • Pivoting = using a compromised system to move deeper into the network.
      • Adversarial examples = inputs crafted to trick a model into wrong outputs (minimal sketch below).
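
To make the adversarial-example idea concrete, here is a minimal FGSM-style sketch in PyTorch. The toy classifier, random input, label, and epsilon are all illustrative assumptions, not anything from a real system:

```python
# Minimal FGSM-style adversarial example (sketch; toy model and fake data).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy classifier
model.eval()

x = torch.rand(1, 1, 28, 28, requires_grad=True)   # fake "image"
y = torch.tensor([3])                               # arbitrary "true" label

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                                     # gradient of loss w.r.t. x

epsilon = 0.1                                       # perturbation budget
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1)   # step *up* the loss gradient

# A tiny, structured nudge can flip the prediction even though x_adv
# looks essentially identical to x.
print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))
```

The same gradient machinery used for training is what makes the attack cheap: one backward pass gives the direction that hurts the model most.
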
  • Studied operational evasion in threat models:
    • Attackers shape their behavior to slip past detection systems.
    • Red team note: understanding defender culture reveals what they neglect → the neglected areas become attack surface.
  • Explored ML/DL stack:
    • ML → DL → Transformers → LLMs progression.
    • PyTorch and TensorFlow: frameworks for building and training ML/DL models; both are driven primarily from Python (minimal training-loop sketch below).
    • R language: stats-heavy, strong for data analysis.
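
For the framework note above, a minimal sketch of what "build and train" looks like in PyTorch. The architecture, optimizer settings, and synthetic data are placeholders:

```python
# Tiny PyTorch build-and-train loop (sketch; synthetic data, toy model).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(64, 4)          # 64 fake samples, 4 features each
y = torch.randint(0, 2, (64,))  # fake binary labels

for epoch in range(5):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(X), y)
    loss.backward()             # autograd fills in the gradients
    optimizer.step()            # SGD nudges the weights
    print(f"epoch {epoch}: loss={loss.item():.3f}")
```

TensorFlow follows the same overall shape (define model → pick loss/optimizer → iterate over data); the mechanics differ but the progression is the same.
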
  • Learned about corpora: large structured sets of text used to train NLP/LLMs.
  • Checked HuggingFace.co: hub for models, datasets, and tools for NLP/LLMs (combined corpus/NLP sketch after this list).
  • Reviewed NLP basics: field of AI for processing and understanding human language.
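
Tying the last three bullets together: a quick sketch that pulls a text corpus and a pretrained NLP model from the Hugging Face hub. The "imdb" dataset and the pipeline's default sentiment model are example choices (both require a network download); any hub dataset/model works the same way:

```python
# Corpus + pretrained NLP model from the Hugging Face hub (sketch).
from datasets import load_dataset     # pip install datasets
from transformers import pipeline     # pip install transformers

corpus = load_dataset("imdb", split="train[:5]")  # tiny slice of a text corpus
classifier = pipeline("sentiment-analysis")       # downloads a default model

for row in corpus:
    # Truncate long reviews so they fit comfortably in the model's context.
    print(classifier(row["text"][:512]))
```
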

Sys Admin

  • Investigated Viva Engage Native Mode upgrade failure in the Microsoft 365 tenant.
    • Logged the issue: accessing the admin center bounces back to the login screen instead of loading.