LittleBug

[ECCV 2024] GenAD: Generative End-to-End Autonomous Driving

Posted on 2024-12-15 | paperReading > end2end

Motivation: Most existing end-to-end autonomous driving models are composed of several modules and follow a pipeline of perception, motion prediction, and planning. However, the serial design of prediction and planning in existing pipelines ignores the possible future interactions between the ego car and the other traffic participants. Also, future trajectories are highly structured and share a common prior (e.g., most trajectories are continuous and straight lines). Still, most existing...

Local port forwarding through multiple jump hosts: accessing a remote port locally

Posted on 2024-12-13 | environment-construction

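Local port forwarding: Local host A wants to reach the service on server B's localhost:10086. How can this be done? A common approach is to map a port on host A to port 10086 on host B. A cannot reach B's IP address directly, but it can establish an SSH connection to B through jump hosts J1 and J2, i.e. A -> J1 -> J2 -> B.

host  ip       username  port
A     /        /         10001, 10002, 10003
J1    1.1.1.1  user1     22 (SSH connections are made on port 22)
J2    2.2.2.2  user2     22
B     3.3.3.3  user3     22

A logs in to jump host J1 and maps A's port 10001 to port 22 on jump host J2 (2.2.2.2:22):
sshpass -p 'pw1' ssh -f -L 10001:2.2.2.2:22 -N -o StrictHostKeyChecking=no user1@1.1.1.1
A logs in to jump host J2 (by connecting over SSH to localhost:10001) and maps A's port 10002 to port 22 on B...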
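The chaining pattern above can be sketched as a small helper that prints one `ssh -L` command per hop. This is only an illustration under assumptions: the function name `build_tunnel_commands` is hypothetical, key-based login is assumed (so `sshpass` is left out), and the last hop forwarding local port 10003 to B's localhost:10086 is inferred from the port list in the table rather than stated in the excerpt.

```python
from typing import List, Tuple

Hop = Tuple[str, str, int]  # (ip, username, ssh_port)


def build_tunnel_commands(hops: List[Hop], target: str, first_local_port: int) -> List[str]:
    """Build one `ssh -L` command per hop (hypothetical helper).

    Each command opens a local port on A. For every hop except the last,
    that port forwards to the SSH port of the next hop; for the last hop
    it forwards to `target` (resolved on the last host).
    """
    commands = []
    # The first hop is reached directly; later hops are reached through
    # the local port opened by the previous command.
    connect_host, connect_port = hops[0][0], hops[0][2]
    local_port = first_local_port
    for i, (_, user, _) in enumerate(hops):
        if i + 1 < len(hops):
            next_ip, _, next_port = hops[i + 1]
            forward_to = f"{next_ip}:{next_port}"
        else:
            forward_to = target
        commands.append(
            f"ssh -f -N -L {local_port}:{forward_to} -p {connect_port} {user}@{connect_host}"
        )
        connect_host, connect_port = "localhost", local_port
        local_port += 1
    return commands


hops = [("1.1.1.1", "user1", 22), ("2.2.2.2", "user2", 22), ("3.3.3.3", "user3", 22)]
for command in build_tunnel_commands(hops, "localhost:10086", 10001):
    print(command)
```

Once all three tunnels are up, the service would be reachable on A at localhost:10003, assuming the inferred last hop matches the original post.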

[CVPR 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving

Posted on 2024-11-27 | paperReading > end2end

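Introduction: This paper set a new state of the art (SOTA) in end-to-end autonomous driving (2023). Traditional autonomous driving methods are modular, with perception and planning decoupled into independent modules. The drawback is that the planning module cannot access the raw sensor data, which carries rich semantic information. Because planning relies entirely on the preceding perception results, perception errors can severely affect planning and cannot be detected or corrected at the planning stage, which leads to safety problems. End-to-end autonomous driving methods instead take sensor data as the input for perception and output planning results through one holistic model. This paper proposes Vectorized Scene Representation for Efficient Autonomous Driving (VAD), which models the entire driving scene as a vectorized representation. On the one hand, VAD uses vectorized agent motion and map elements as explicit instance-level planning constraints, which effectively improves planning safety. On the other hand, by dropping computation-intensive rasterized representations and hand-designed post-processing steps, VAD runs much faster than previous end-to-end planning methods. Specifically, VAD-Base,...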

Auto-Encoding Variational Bayes

Posted on 2024-11-22 | paperReading > diffusion | inference • variational inference

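Problem scenario: Given the prior distribution of the latent variable and the conditional generative distribution, the related problems in this setting are: Preliminary: evidence lower bound (variational lower bound). Inference can be understood as computing the posterior distribution $P(Z|X)$:

$P(Z|X)=\frac{P(X,Z)}{\int_z P(X,Z=z)\,dz}$

The denominator (the normalizing term) is hard to compute, so computing the posterior exactly is difficult, and two kinds of methods are commonly used to obtain an approximate posterior. Sampling methods: for example MCMC, which uses Markov chain sampling to approximate the posterior probability; it is computationally expensive and its accuracy depends on the samples. Variational methods: approximate the posterior with a simple probability distribution, which turns inference into an optimization problem. KL divergence: $D_{KL}(q\|p)=E_{x\sim q}\left[\log\frac{q}{p}\right]=\sum_x q(x)\log\frac{q(x)}{p(x)}$...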
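As a small numeric illustration of the KL divergence used by the variational approach, here is a sketch for discrete distributions using the standard definition $D_{KL}(q\|p)=\sum_x q(x)\log\frac{q(x)}{p(x)}$. The function name `kl_divergence` and the example distributions are made up for this illustration.

```python
import math
from typing import Sequence


def kl_divergence(q: Sequence[float], p: Sequence[float]) -> float:
    """D_KL(q || p) for two discrete distributions over the same support.

    Terms with q(x) == 0 contribute nothing, by the convention 0 * log 0 = 0.
    Assumes p(x) > 0 wherever q(x) > 0.
    """
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


q = [0.5, 0.5]  # the simple approximating distribution
p = [0.9, 0.1]  # the distribution being approximated

print(kl_divergence(q, q))  # identical distributions give zero
print(kl_divergence(q, p))  # positive when the distributions differ
print(kl_divergence(p, q))  # generally different: KL is not symmetric
```

The asymmetry is why the direction $D_{KL}(q\|p)$ matters when a simple $q$ is fitted to a posterior $p$.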

Some Git tips from experience

Posted on 2024-11-15 | environment-construction | github

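Connection issues: Reference: https://v2ex.com/t/843383#reply47 A remote repository can be reached in two ways, https and ssh:
https://github.com/xxx.git
git@github.com:xxx.git
You can run git config -l to see the remote URL of the current repository and tell whether it uses an ssh or an https connection. I usually authenticate with an SSH key. The basic flow is to store the public key id_rsa.pub in the GitHub account and keep the private key in ~/.ssh/id_rsa, so that identity is verified automatically on every git push. Setting an HTTP proxy:
git config --global http.proxy socks5://127.0.0.1:7890
In fact socks5h is the better choice, i.e.
git config --global http.proxy socks5h://127.0.0.1:7890
The h stands for host and covers domain name resolution, i.e. DNS resolution is also forced through the proxy. Also, there is no need to configure https.proxy, git...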

Automatically deploying Hexo to GitHub Pages and Vercel

Posted on 2024-11-14 | hexo | github

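Preparation: First, you have a local copy of the blog source, and GitHub needs two repositories: hexo-source and xxxx.github.io.git. You also need a credential for connecting to the GitHub repositories, which can be either a GitHub token or an SSH key. A GitHub token is used to connect to a repository over https; an SSH key is used to connect over ssh and consists of a private key and a public key, with the private key kept locally and the public key uploaded to GitHub. The hexo-source repository backs up the local source and is set to private (after all, I don't want other people to copy my blog setup with a plain git clone). Files listed in .gitignore do not need to be backed up, because they are only environment dependencies and generated output:
.DS_Store
Thumbs.db
db.json
*.log
node_modules/
public/
.deploy*/
_multiconfig.yml
xxxx.github.io.git is the GitHub Pages repository; it must be public, using hexo...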

Automatic differentiation in PyTorch

Posted on 2024-11-10 | pytorch | pytorch • automatic differentiation

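Automatic differentiation: PyTorch supports automatic differentiation, i.e. it computes derivatives automatically. In a deep learning framework, we usually compute the partial derivatives of the loss function with respect to the learnable parameters.
import torch
x = torch.arange(4.0)  # x = tensor([0., 1., 2., 3.])
x.requires_grad_(True)  # equivalent to x = torch.arange(4.0, requires_grad=True)
x.grad  # None by default
If, later on, we need to compute with respect to some variable...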
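To make the role of `requires_grad` and `x.grad` concrete, here is a minimal sketch that continues from the same tensor. The function $y = 2\,x^\top x$ is chosen only for illustration and is not taken from the excerpt; its gradient with respect to $x$ is $4x$.

```python
import torch

# Same starting point as the snippet above, with gradient tracking enabled.
x = torch.arange(4.0, requires_grad=True)

# Illustrative scalar function (an assumption for this example): y = 2 * <x, x>.
y = 2 * torch.dot(x, x)

# Back-propagate from the scalar y; autograd fills in x.grad.
y.backward()

# Analytically dy/dx = 4 * x, i.e. [0, 4, 8, 12] for this x.
print(x.grad)
print(torch.equal(x.grad, 4 * x.detach()))
```

Note that gradients accumulate across `backward()` calls, so `x.grad.zero_()` is typically called before differentiating a different function of the same `x`.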

Diffusion basics

Posted on 2024-11-10 | paperReading > diffusion | diffusion

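Comparison of generative models: A GAN consists of a discriminator and a generator. The discriminator tries to distinguish x from x', while the generator tries to produce samples that get past the discriminator as often as possible. After many iterations, the samples produced by the generator look more and more like x, and these are the generated samples we want. A VAE is a network that learns a distribution function, but here the distribution function maps from the sample space to the semantic space. Flow-based models are the network structures that truly begin to learn the distribution. Overall: forward process (diffusion process): from right to left, $X_0 \rightarrow X_T$; reverse process (denoising process): from left to right, $X_T \rightarrow X_0$. Both the diffusion process and the denoising process are treated as Markov processes. $x_0 \sim q(x_0)$ The task is: learn a distribution...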
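The Markov structure of the forward process $X_0 \rightarrow X_T$ can be sketched in a few lines. The excerpt does not give the transition itself, so this example assumes the Gaussian step commonly used in DDPM, $x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, and a linear $\beta$ schedule; the function names and schedule values are illustrative choices, not the post's.

```python
import math
import random
from typing import List


def linear_betas(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule, T >= 2)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def forward_diffusion(x0: List[float], betas: List[float], rng: random.Random) -> List[List[float]]:
    """Return the trajectory [x_0, x_1, ..., x_T].

    Each x_t is sampled from x_{t-1} only, which is what makes the
    process a Markov chain.
    """
    xs = [list(x0)]
    for beta in betas:
        prev = xs[-1]
        xs.append([
            math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for v in prev
        ])
    return xs


rng = random.Random(0)
betas = linear_betas(1000)
trajectory = forward_diffusion([1.0, -1.0, 0.5], betas, rng)

# Fraction of the original signal variance left in x_T: prod(1 - beta_t).
signal_left = math.prod(1.0 - b for b in betas)
print(len(trajectory), signal_left)
```

Under these assumptions almost none of $x_0$ survives in $x_T$, which is why the reverse (denoising) process can start from pure noise.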

Attention is all you need

Posted on 2024-11-06 | paperReading > transformer | transformer

Model architecture: Inputs: a paragraph of English consisting of $B$ (i.e. batch_size) sentences. Each sentence has at most $N$ (i.e. seq_length) words. Outputs: a paragraph of Chinese translated from the inputs. $(B,N)$ Encoder output: the feature matrix containing position, context, and semantic information. Decoder: auto-regressive, consuming the previously generated symbols as additional input when generating the next. For example: Inputs: I love u. ($B=1$) Learning feature from...
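The auto-regressive behaviour of the decoder can be illustrated with a toy greedy decoding loop. Everything here is a stand-in: `toy_next_token` replaces a real encoder-decoder forward pass with a fixed lookup, and the token strings and `<bos>`/`<eos>` markers are hypothetical.

```python
from typing import Callable, List

# Hypothetical "translation" the toy model emits, one token per step.
TOY_OUTPUT = ["我", "爱", "你", "<eos>"]


def toy_next_token(source: List[str], generated: List[str]) -> str:
    """Stand-in for the decoder: picks the next token from the tokens so far.

    A real model would attend over the encoder features of `source` and
    over `generated`; here only the number of generated tokens is used.
    """
    step = len(generated) - 1  # generated[0] is the <bos> marker
    return TOY_OUTPUT[min(step, len(TOY_OUTPUT) - 1)]


def greedy_decode(
    next_token: Callable[[List[str], List[str]], str],
    source: List[str],
    bos: str,
    eos: str,
    max_len: int,
) -> List[str]:
    """Generate tokens one at a time, feeding previous outputs back in."""
    generated = [bos]
    for _ in range(max_len):
        token = next_token(source, generated)
        generated.append(token)
        if token == eos:
            break
    return generated[1:]  # drop the <bos> marker


print(greedy_decode(toy_next_token, ["I", "love", "u", "."], "<bos>", "<eos>", 10))
```

The loop structure is the point: each step sees the full source plus everything generated so far, and generation stops at the end marker or at `max_len`.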

[CVPR 2024] Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow

Posted on 2024-11-04 | paperReading > multimodal | multimodal • lidar • camera

Motivation: Scene flow aims to model the correspondence between adjacent visual RGB or LiDAR features to estimate 3D motion features. RGB and LiDAR are intrinsically heterogeneous, so fusing them directly is inappropriate. We discover that events are homogeneous with both RGB and LiDAR in the visual and motion spaces. Visual space complementarity: RGB camera: absolute value of luminance; event camera: relative change of luminance; LiDAR: global shape; event camera: local boundary motion...